Patent abstract:
The present invention relates to an image encoder, which includes a circuit and a memory coupled to the circuit. The circuit, in operation, performs: splitting an image block into a plurality of partitions that includes a first partition that has a non-rectangular shape (for example, a triangular shape) and a second partition; predicting a first motion vector for the first partition and a second motion vector for the second partition; and encoding the first partition using the first motion vector and the second partition using the second motion vector.
Publication number: BR112020001991A2
Application number: R112020001991-7
Filing date: 2018-08-10
Publication date: 2020-08-18
Inventors: Kiyofumi Abe;Jing Ya LI;Takahiro Nishi;Tadamasa Toma;Ryuichi KANOH;Chong Soon Lim;Ru Ling LIAO;Hai Wei Sun;Sughosh Pavan SHASHIDHAR;Han Boon Teo
Applicant: Panasonic Intellectual Property Corporation Of America;
Primary IPC class:
Patent description:

[0001] This description relates to video encoding, and specifically to systems, components, and methods in video encoding and decoding for performing an inter prediction function to build a current block based on a reference frame, or an intra prediction function to build a current block based on an encoded/decoded reference block in a current frame. BACKGROUND ART
[0002] With the advancement of video encoding technology, from H.261 and MPEG-1 to H.264/AVC (Advanced Video Coding), MPEG-LA, H.265/HEVC (High Efficiency Video Coding), and H.266/VVC (Versatile Video Codec), there remains a constant need to provide improvements and optimizations in video encoding technology to process an ever-increasing amount of digital video data in various applications. This description relates to further advances, refinements, and optimizations in video encoding, specifically, in connection with an inter prediction function or an intra prediction function, dividing an image block into a plurality of partitions that includes at least a first partition that has a non-rectangular shape (for example, a triangle) and a second partition. SUMMARY OF THE INVENTION
[0003] According to one aspect, an image encoder is provided that includes a circuit and a memory coupled to the circuit. The circuit, in operation, performs: dividing an image block into a plurality of partitions that includes a first partition that has a non-rectangular shape and a second partition; predicting a first motion vector for the first partition and a second motion vector for the second partition; and encoding the first partition using the first motion vector and the second partition using the second motion vector.
[0004] Some implementations of the embodiments of the present description can improve coding efficiency, can simplify an encoding/decoding process, can accelerate an encoding/decoding process, and can efficiently select appropriate components/operations used in encoding and decoding, such as an appropriate filter, block size, motion vector, reference image, reference block, etc.
[0005] Additional benefits and advantages of the described embodiments will be apparent from the specification and drawings. The benefits and/or advantages can be individually obtained by the various embodiments and features of the specification and drawings, not all of which need to be provided in order to obtain one or more of such benefits and/or advantages.
[0006] It should be noted that general or specific embodiments can be implemented as a system, a method, an integrated circuit, a computer program, a storage medium, or any selective combination thereof. BRIEF DESCRIPTION OF THE DRAWINGS
[0007] Figure 1 is a block diagram that illustrates a functional configuration of an encoder according to an embodiment.
[0008] Figure 2 illustrates an example of block division.
[0009] Figure 3 is a table that indicates transform basis functions of various transform types.
[0010] Figure 4A illustrates an example of a filter shape used in ALF (adaptive loop filter).
[0011] Figure 4B illustrates another example of a filter shape used in ALF.
[0012] Figure 4C illustrates another example of a filter shape used in ALF.
[0013] Figure 5A illustrates 67 intra prediction modes used in an example of intra prediction.
[0014] Figure 5B is a flowchart that illustrates an example of a prediction image correction process performed in OBMC (overlapped block motion compensation) processing.
[0015] Figure 5C is a conceptual diagram that illustrates an example of a prediction image correction process performed in OBMC processing.
[0016] Figure 5D is a flowchart that illustrates an example of FRUC (frame rate up-conversion) processing.
[0017] Figure 6 illustrates an example of pattern matching (bilateral matching) between two blocks along a motion trajectory.
[0018] Figure 7 illustrates an example of pattern matching (template matching) between a template in the current image and a block in a reference image.
[0019] Figure 8 illustrates a model that assumes uniform linear motion.
[0020] Figure 9A illustrates an example of deriving a motion vector of each sub-block based on motion vectors of neighboring blocks.
[0021] Figure 9B illustrates an example of a process for deriving a motion vector in merge mode.
[0022] Figure 9C is a conceptual diagram that illustrates an example of DMVR (dynamic motion vector refreshing) processing.
[0023] Figure 9D illustrates an example of a prediction image generation method using a luminance correction process performed by LIC (local illumination compensation) processing.
[0024] Figure 10 is a block diagram that illustrates a functional configuration of the decoder according to an embodiment.
[0025] Figure 11 is a flowchart illustrating a general process flow of dividing an image block into a plurality of partitions that includes at least a first partition that has a non-rectangular shape (for example, a triangle) and a second partition, and performing additional processing, according to an embodiment.
[0026] Figure 12 illustrates two exemplary methods of dividing an image block into a first partition that has a non-rectangular shape (for example, a triangle) and a second partition (also having a non-rectangular shape in the illustrated examples).
[0027] Figure 13 illustrates an example of a boundary smoothing process that involves weighting first predicted boundary pixel values based on the first partition and second predicted boundary pixel values based on the second partition.
[0028] Figure 14 illustrates three further examples of a boundary smoothing process that involves weighting first predicted boundary pixel values based on the first partition and second predicted boundary pixel values based on the second partition.
[0029] Figure 15 is a table of sample parameters ("first index values") and sets of information respectively encoded by the parameters.
[0030] Figure 16 is a table that illustrates binarization of parameters (index values).
[0031] Figure 17 is a flowchart illustrating a process of dividing an image block into a plurality of partitions that includes a first partition that has a non-rectangular shape and a second partition.
[0032] Figure 18 illustrates examples of dividing an image block into a plurality of partitions that includes a first partition that has a non-rectangular shape, which is a triangle in the illustrated examples, and a second partition.
[0033] Figure 19 illustrates additional examples of dividing an image block into a plurality of partitions that includes a first partition that has a non-rectangular shape, which is a polygon with at least five sides and angles in the illustrated examples, and a second partition.
[0034] Figure 20 is a flowchart that illustrates a boundary smoothing process that involves weighting first predicted boundary pixel values based on the first partition and second predicted boundary pixel values based on the second partition.
[0035] Figure 21A illustrates an example of a boundary smoothing process in which, for boundary pixels, first values to be weighted are predicted based on the first partition and second values to be weighted are predicted based on the second partition.
[0036] Figure 21B illustrates an example of a boundary smoothing process in which, for boundary pixels, first values to be weighted are predicted based on the first partition and second values to be weighted are predicted based on the second partition.
[0037] Figure 21C illustrates an example of a boundary smoothing process in which, for boundary pixels, first values to be weighted are predicted based on the first partition and second values to be weighted are predicted based on the second partition.
[0038] Figure 21D illustrates an example of a boundary smoothing process in which, for boundary pixels, first values to be weighted are predicted based on the first partition and second values to be weighted are predicted based on the second partition.
[0039] Figure 22 is a flowchart illustrating a method, performed on the encoder side, of dividing an image block into a plurality of partitions that includes a first partition that has a non-rectangular shape and a second partition, based on a partition parameter indicative of the division, and writing one or more parameters that include the partition parameter into a bit stream, in entropy encoding.
[0040] Figure 23 is a flowchart illustrating a method, performed on the decoder side, of parsing one or more parameters from a bit stream, which include a partition parameter indicative of dividing an image block into a plurality of partitions that includes a first partition that has a non-rectangular shape and a second partition, dividing the image block into the plurality of partitions based on the partition parameter, and decoding the first partition and the second partition.
[0041] Figure 24 is a table of sample partition parameters ("first index values") that respectively indicate the division of an image block into a plurality of partitions that includes a first partition that has a non-rectangular shape and a second partition, and sets of information that can be jointly encoded by the partition parameters, respectively.
[0042] Figure 25 is a table of sample combinations of a first parameter and a second parameter, one of which is a partition parameter indicative of dividing an image block into a plurality of partitions that includes a first partition that has a non-rectangular shape and a second partition.
[0043] Figure 26 illustrates a general configuration of a content delivery system to implement a content distribution service.
[0044] Figure 27 illustrates an example of a coding structure in scalable coding.
[0045] Figure 28 illustrates an example of a coding structure in scalable coding.
[0046] Figure 29 illustrates an example of a display screen for a webpage.
[0047] Figure 30 illustrates an example of a webpage display screen.
[0048] Figure 31 illustrates an example of a smartphone.
[0049] Figure 32 is a block diagram that illustrates an example of a smartphone configuration. DESCRIPTION OF EMBODIMENTS
[0050] According to one aspect, an image encoder is provided that includes a circuit and a memory coupled to the circuit. The circuit, in operation, performs: dividing an image block into a plurality of partitions that includes a first partition that has a non-rectangular shape and a second partition; predicting a first motion vector for the first partition and a second motion vector for the second partition; and encoding the first partition using the first motion vector and the second partition using the second motion vector.
[0051] According to an additional aspect, the second partition has a non-rectangular shape. According to another aspect, the non-rectangular shape is a triangle. According to an additional aspect, the non-rectangular shape is selected from a group consisting of a triangle, a trapezoid, and a polygon with at least five sides and angles.
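As a non-normative illustration of the triangular case, the sketch below builds pixel masks for the two partitions of a block split along one of its diagonals. The function name and the direction labels ("tl_br", "tr_bl") are hypothetical, not taken from this description.

```python
import numpy as np

def triangle_partition_masks(width: int, height: int, direction: str):
    # Returns boolean masks for the first (triangular) partition and the
    # second partition of a width x height block split along a diagonal.
    ys, xs = np.mgrid[0:height, 0:width]
    if direction == "tl_br":    # boundary from the upper left to the lower right corner
        first = xs * height < ys * width
    elif direction == "tr_bl":  # boundary from the upper right to the lower left corner
        first = (width - 1 - xs) * height < ys * width
    else:
        raise ValueError("unknown split direction")
    return first, ~first

mask_first, mask_second = triangle_partition_masks(8, 8, "tl_br")
```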
[0052] According to another aspect, the prediction includes selecting the first motion vector from a first set of motion vector candidates and selecting the second motion vector from a second set of motion vector candidates. For example, the first set of motion vector candidates can include motion vectors of partitions neighboring the first partition, and the second set of motion vector candidates can include motion vectors of partitions neighboring the second partition. Partitions neighboring the first partition and partitions neighboring the second partition can be outside the image block from which the first partition and the second partition are divided. The neighboring partitions can be one or both of spatially neighboring partitions and temporally neighboring partitions. The first set of motion vector candidates can be the same as, or different from, the second set of motion vector candidates.
[0053] According to another aspect, the prediction includes selecting a first motion vector candidate from a first set of motion vector candidates and deriving the first motion vector by adding a first motion vector difference to the first motion vector candidate, and selecting a second motion vector candidate from a second set of motion vector candidates and deriving the second motion vector by adding a second motion vector difference to the second motion vector candidate.
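A minimal sketch of this candidate-plus-difference derivation follows; the candidate values and the helper name are assumptions for illustration.

```python
from typing import List, Tuple

MV = Tuple[int, int]  # (horizontal, vertical) motion vector components

def derive_motion_vector(candidates: List[MV], index: int, mvd: MV) -> MV:
    # Select a candidate from the candidate set and add the signaled
    # motion vector difference (MVD) to obtain the final motion vector.
    base = candidates[index]
    return (base[0] + mvd[0], base[1] + mvd[1])

# Candidates gathered from spatially/temporally neighboring partitions (toy values).
first_candidates = [(4, 0), (-2, 1), (0, 0)]
mv_first = derive_motion_vector(first_candidates, index=1, mvd=(1, -1))  # -> (-1, 0)
```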
[0054] According to another aspect, an image encoder is provided that includes: a divider which, in operation, receives and divides an original image into blocks; an adder which, in operation, receives blocks from the divider and predictions from a prediction controller, and subtracts each prediction from its corresponding block to output a residual; a transformer which, in operation, performs a transform on the residuals output from the adder to output transform coefficients; a quantizer which, in operation, quantizes the transform coefficients to generate quantized transform coefficients; an entropy encoder which, in operation, encodes the quantized transform coefficients to generate a bit stream; and the prediction controller coupled to an inter predictor, an intra predictor, and a memory, in which the inter predictor, in operation, generates a prediction of a current block based on a reference block in an encoded reference image and the intra predictor, in operation, generates a prediction of a current block based on an encoded reference block in a current image. The prediction controller, in operation, divides an image block into a plurality of partitions that includes a first partition that has a non-rectangular shape and a second partition; predicts a first motion vector for the first partition and a second motion vector for the second partition; and encodes the first partition using the first motion vector and the second partition using the second motion vector.
[0055] According to another aspect, an image encoding method is provided, which generally includes three steps: dividing an image block into a plurality of partitions that includes a first partition that has a non-rectangular shape and a second partition; predicting a first motion vector for the first partition and a second motion vector for the second partition; and encoding the first partition using the first motion vector and the second partition using the second motion vector.
[0056] According to another aspect, an image decoder is provided that includes a circuit and a memory coupled to the circuit. The circuit, in operation, performs: dividing an image block into a plurality of partitions that includes a first partition that has a non-rectangular shape and a second partition; predicting a first motion vector for the first partition and a second motion vector for the second partition; and decoding the first partition using the first motion vector and the second partition using the second motion vector.
[0057] According to an additional aspect, the second partition has a non-rectangular shape. According to another aspect, the non-rectangular shape is a triangle. According to an additional aspect, the non-rectangular shape is selected from a group consisting of a triangle, a trapezoid, and a polygon with at least five sides and angles.
[0058] According to another aspect, an image decoder is provided that includes: an entropy decoder which, in operation, receives and decodes an encoded bit stream to obtain quantized transform coefficients; an inverse quantizer and transformer which, in operation, inverse quantizes the quantized transform coefficients to obtain transform coefficients and inverse transforms the transform coefficients to obtain residuals; an adder which, in operation, adds the residuals output from the inverse quantizer and transformer and predictions output from a prediction controller to reconstruct blocks; and the prediction controller coupled to an inter predictor, an intra predictor, and a memory, in which the inter predictor, in operation, generates a prediction of a current block based on a reference block in a decoded reference image and the intra predictor, in operation, generates a prediction of a current block based on a decoded reference block in a current image. The prediction controller, in operation, divides an image block into a plurality of partitions that includes a first partition that has a non-rectangular shape and a second partition; predicts a first motion vector for the first partition and a second motion vector for the second partition; and decodes the first partition using the first motion vector and the second partition using the second motion vector.
[0059] According to another aspect, an image decoding method is provided, which generally includes three steps: dividing an image block into a plurality of partitions that includes a first partition that has a non-rectangular shape and a second partition; predicting a first motion vector for the first partition and a second motion vector for the second partition; and decoding the first partition using the first motion vector and the second partition using the second motion vector.
[0060] According to one aspect, an image encoder is provided that includes a circuit and a memory coupled to the circuit. The circuit, in operation, performs a smoothing operation along a boundary between a first partition that has a non-rectangular shape and a second partition, which are divided from an image block. The boundary smoothing operation includes: first prediction of first values of a set of pixels of the first partition along the boundary, using information of the first partition; second prediction of second values of the set of pixels of the first partition along the boundary, using information of the second partition; weighting the first values and the second values; and encoding the first partition using the weighted first values and the weighted second values.
[0061] According to an additional aspect, the non-rectangular shape is a triangle. According to another aspect, the non-rectangular shape is selected from a group consisting of a triangle, a trapezoid, and a polygon with at least five sides and angles.
[0062] According to another aspect, at least one of the first prediction and the second prediction is an inter prediction process that predicts the first values and the second values based on a reference partition in an encoded reference image. The inter prediction process can predict first values of all pixels of the first partition, which include the set of pixels, and can predict second values of only the set of pixels of the first partition.
[0063] According to another aspect, at least one of the first prediction and the second prediction is an intra prediction process that predicts the first values and the second values based on an encoded reference partition in a current image.
[0064] According to another aspect, a prediction method used in the first prediction is different from a prediction method used in the second prediction.
[0065] According to an additional aspect, the number of pixels in the set, in each row or each column for which the first values and the second values are predicted, is an integer. For example, when the number of pixels in each row or column is four, weights of 1/8, 1/4, 3/4, and 7/8 can be applied to the first values of the four pixels in the set, respectively, and weights of 7/8, 3/4, 1/4, and 1/8 can be applied to the second values of the four pixels in the set, respectively. As another example, when the number of pixels in each row or column is two, weights of 1/3 and 2/3 can be applied to the first values of the two pixels in the set, respectively, and weights of 2/3 and 1/3 can be applied to the second values of the two pixels in the set, respectively.
[0066] According to another aspect, the weights can be integer values or fractional values.
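The weighted blend described above can be sketched as follows; this is a toy illustration, with the helper name and the sample values assumed, using the four-pixel weights quoted in the text.

```python
import numpy as np

def blend_boundary(first_vals, second_vals, weights_first):
    # Weighted blend of two predictions of the same boundary pixels: the
    # first prediction gets weight w, the second gets (1 - w), so each
    # pair of weights sums to one (e.g., 1/8 with 7/8, 1/4 with 3/4).
    w = np.asarray(weights_first, dtype=float)
    return w * np.asarray(first_vals, dtype=float) + (1.0 - w) * np.asarray(second_vals, dtype=float)

# Four boundary pixels in one row, blended with weights 1/8, 1/4, 3/4, 7/8.
row = blend_boundary([100, 100, 100, 100], [60, 60, 60, 60], [1/8, 1/4, 3/4, 7/8])
```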
[0067] According to another aspect, an image encoder is provided that includes: a divider which, in operation, receives and divides an original image into blocks; an adder which, in operation, receives blocks from the divider and predictions from a prediction controller, and subtracts each prediction from its corresponding block to output a residual; a transformer which, in operation, performs a transform on the residuals output from the adder to output transform coefficients; a quantizer which, in operation, quantizes the transform coefficients to generate quantized transform coefficients; an entropy encoder which, in operation, encodes the quantized transform coefficients to generate a bit stream; and the prediction controller coupled to an inter predictor, an intra predictor, and a memory, in which the inter predictor, in operation, generates a prediction of a current block based on a reference block in an encoded reference image and the intra predictor, in operation, generates a prediction of a current block based on an encoded reference block in a current image. The prediction controller, in operation, performs a boundary smoothing operation along a boundary between a first partition that has a non-rectangular shape and a second partition, which are divided from an image block. The boundary smoothing operation includes: first prediction of first values of a set of pixels of the first partition along the boundary, using information of the first partition; second prediction of second values of the set of pixels of the first partition along the boundary, using information of the second partition; weighting the first values and the second values; and encoding the first partition using the weighted first values and the weighted second values.
[0068] According to another aspect, an image encoding method is provided to perform a boundary smoothing operation along a boundary between a first partition that has a non-rectangular shape and a second partition, which are divided from an image block. The method generally includes four steps: first prediction of first values of a set of pixels of the first partition along the boundary, using information of the first partition; second prediction of second values of the set of pixels of the first partition along the boundary, using information of the second partition; weighting the first values and the second values; and encoding the first partition using the weighted first values and the weighted second values.
[0069] According to an additional aspect, an image decoder is provided that includes a circuit and a memory coupled to the circuit. The circuit, in operation, performs a boundary smoothing operation along a boundary between a first partition that has a non-rectangular shape and a second partition, which are divided from an image block. The boundary smoothing operation includes: first prediction of first values of a set of pixels of the first partition along the boundary, using information of the first partition; second prediction of second values of the set of pixels of the first partition along the boundary, using information of the second partition; weighting the first values and the second values; and decoding the first partition using the weighted first values and the weighted second values.
[0070] According to another aspect, the non-rectangular shape is a triangle. According to an additional aspect, the non-rectangular shape is selected from a group consisting of a triangle, a trapezoid, and a polygon with at least five sides and angles.
[0071] According to another aspect, at least one of the first prediction and the second prediction is an inter prediction process that predicts the first values and the second values based on a reference partition in an encoded reference image. The inter prediction process can predict first values of all pixels of the first partition, which include the set of pixels, and can predict second values of only the set of pixels of the first partition.
[0072] According to another aspect, at least one of the first prediction and the second prediction is an intra prediction process that predicts the first values and the second values based on an encoded reference partition in a current image.
[0073] According to another aspect, an image decoder is provided that includes: an entropy decoder which, in operation, receives and decodes an encoded bit stream to obtain quantized transform coefficients; an inverse quantizer and transformer which, in operation, inverse quantizes the quantized transform coefficients to obtain transform coefficients and inverse transforms the transform coefficients to obtain residuals; an adder which, in operation, adds the residuals output from the inverse quantizer and transformer and predictions output from a prediction controller to reconstruct blocks; and the prediction controller coupled to an inter predictor, an intra predictor, and a memory, in which the inter predictor, in operation, generates a prediction of a current block based on a reference block in a decoded reference image and the intra predictor, in operation, generates a prediction of a current block based on a decoded reference block in a current image. The prediction controller, in operation, performs a boundary smoothing operation along a boundary between a first partition that has a non-rectangular shape and a second partition, which are divided from an image block. The boundary smoothing operation includes: first prediction of first values of a set of pixels of the first partition along the boundary, using information of the first partition; second prediction of second values of the set of pixels of the first partition along the boundary, using information of the second partition; weighting the first values and the second values; and decoding the first partition using the weighted first values and the weighted second values.
[0074] According to another aspect, an image decoding method is provided to perform a boundary smoothing operation along a boundary between a first partition that has a non-rectangular shape and a second partition, which are divided from an image block. The method generally includes four steps: first prediction of first values of a set of pixels of the first partition along the boundary, using information of the first partition; second prediction of second values of the set of pixels of the first partition along the boundary, using information of the second partition; weighting the first values and the second values; and decoding the first partition using the weighted first values and the weighted second values.
[0075] According to one aspect, an image encoder is provided that includes a circuit and a memory coupled to the circuit. The circuit, in operation, performs a partition syntax operation that includes: dividing an image block into a plurality of partitions that includes a first partition that has a non-rectangular shape and a second partition, based on a partition parameter indicative of the division; encoding the first partition and the second partition; and writing one or more parameters that include the partition parameter into a bit stream.
[0076] According to an additional aspect, the partition parameter indicates that the first partition has a triangular shape.
[0077] According to another aspect, the partition parameter indicates that the second partition has a non-rectangular shape.
[0078] According to another aspect, the partition parameter indicates that the non-rectangular shape is one of a triangle, a trapezoid, and a polygon with at least five sides and angles.
[0079] According to another aspect, the partition parameter jointly encodes a division direction applied to divide the image block into the plurality of partitions. For example, the division direction may include: from an upper left corner of the image block to its lower right corner, and from an upper right corner of the image block to its lower left corner. The partition parameter can jointly encode at least a first motion vector of the first partition.
[0080] According to another aspect, one or more parameters other than the partition parameter encode a division direction applied to divide the image block into the plurality of partitions. The parameter that encodes the division direction can jointly encode at least a first motion vector of the first partition.
[0081] According to another aspect, the partition parameter can jointly encode at least a first motion vector of the first partition. The partition parameter can jointly encode a second motion vector of the second partition.
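One way to picture such joint coding is a lookup table in the spirit of Figures 15 and 24, where a single index selects the division direction together with the motion vector candidate indices of the two partitions. The table entries below are purely illustrative assumptions, not values from this description.

```python
# index -> (division direction, MV candidate index of the first partition,
#           MV candidate index of the second partition); illustrative only.
PARTITION_TABLE = {
    0: ("upper_left_to_lower_right", 0, 1),
    1: ("upper_left_to_lower_right", 1, 0),
    2: ("upper_right_to_lower_left", 0, 1),
    3: ("upper_right_to_lower_left", 1, 0),
}

def decode_partition_parameter(index: int):
    # A decoder parses the single index and recovers all jointly
    # encoded pieces of information with one table lookup.
    return PARTITION_TABLE[index]

direction, mv_idx_first, mv_idx_second = decode_partition_parameter(2)
```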
[0082] According to another aspect, one or more parameters other than the partition parameter can encode at least a first motion vector of the first partition.
[0083] According to another aspect, the one or more parameters are binarized in accordance with a binarization scheme that is selected depending on a value of at least one of the one or more parameters.
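As a sketch of value-dependent binarization, the snippet below implements truncated unary binarization, one common scheme in which the codeword length depends on the value being coded; treating it as the scheme selected here is an assumption for illustration.

```python
def truncated_unary(value: int, max_value: int) -> str:
    # Truncated unary binarization: a value v < max_value is coded as v
    # ones followed by a terminating zero; the largest value drops the
    # terminating zero, so small (frequent) values get short codewords.
    if value < max_value:
        return "1" * value + "0"
    return "1" * max_value

assert truncated_unary(0, 4) == "0"
assert truncated_unary(2, 4) == "110"
assert truncated_unary(4, 4) == "1111"
```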
[0084] In accordance with an additional aspect, an image encoder is provided that includes: a divider which, in operation, receives and divides an original image into blocks; an adder which, in operation, receives blocks from the divider and predictions from a prediction controller, and subtracts each prediction from its corresponding block to output a residual; a transformer which, in operation, performs a transform on the residuals output from the adder to output transform coefficients; a quantizer which, in operation, quantizes the transform coefficients to generate quantized transform coefficients; an entropy encoder which, in operation, encodes the quantized transform coefficients to generate a bit stream; and the prediction controller coupled to an inter predictor, an intra predictor, and a memory, in which the inter predictor, in operation, generates a prediction of a current block based on a reference block in an encoded reference image and the intra predictor, in operation, generates a prediction of a current block based on an encoded reference block in a current image. The prediction controller, in operation, divides an image block into a plurality of partitions that includes a first partition that has a non-rectangular shape and a second partition, based on a partition parameter indicative of the division, and encodes the first partition and the second partition. The entropy encoder, in operation, writes one or more parameters that include the partition parameter into a bit stream.
[0085] According to another aspect, an image encoding method that includes a partition syntax operation is provided. The method generally includes three steps: dividing an image block into a plurality of partitions that includes a first partition that has a non-rectangular shape and a second partition, based on a partition parameter indicative of the division; encoding the first partition and the second partition; and writing one or more parameters that include the partition parameter into a bit stream.
[0086] According to another aspect, an image decoder is provided that includes a circuit and a memory coupled to the circuit. The circuit, in operation, performs a partition syntax operation that includes: parsing one or more parameters from a bit stream, where the one or more parameters include a partition parameter indicative of dividing an image block into a plurality of partitions that includes a first partition that has a non-rectangular shape and a second partition; dividing the image block into the plurality of partitions based on the partition parameter; and decoding the first partition and the second partition.
[0087] According to an additional aspect, the partition parameter indicates that the first partition has a triangular shape.
[0088] According to another aspect, the partition parameter indicates that the second partition has a non-rectangular shape.
[0089] According to another aspect, the partition parameter indicates that the non-rectangular shape is one of a triangle, a trapezoid, and a polygon with at least five sides and angles.
[0090] According to another aspect, the partition parameter jointly encodes a division direction applied to divide the image block into the plurality of partitions. For example, the division direction includes: from an upper left corner of the image block to its lower right corner, and from an upper right corner of the image block to its lower left corner. The partition parameter can jointly encode at least a first motion vector of the first partition.
[0091] According to another aspect, one or more parameters other than the partition parameter encode a division direction applied to divide the image block into the plurality of partitions. The parameter that encodes the division direction can jointly encode at least a first motion vector of the first partition.
[0092] According to another aspect, the partition parameter can jointly encode at least a first motion vector of the first partition. The partition parameter can jointly encode a second motion vector of the second partition.
[0093] According to another aspect, one or more parameters other than the partition parameter can encode at least a first motion vector of the first partition.
[0094] According to another aspect, the one or more parameters are binarized in accordance with a binarization scheme that is selected depending on a value of at least one of the one or more parameters.
[0095] According to an additional aspect, an image decoder is provided that includes: an entropy decoder which, in operation, receives and decodes an encoded bit stream to obtain quantized transform coefficients; an inverse quantizer and transformer which, in operation, inverse quantizes the quantized transform coefficients to obtain transform coefficients and inverse transforms the transform coefficients to obtain residuals; an adder which, in operation, adds the residuals output from the inverse quantizer and transformer and predictions output from a prediction controller to reconstruct blocks; and the prediction controller coupled to an inter predictor, an intra predictor, and a memory, in which the inter predictor, in operation, generates a prediction of a current block based on a reference block in a decoded reference image and the intra predictor, in operation, generates a prediction of a current block based on a decoded reference block in a current image. The entropy decoder, in operation: parses one or more parameters from a bit stream, where the one or more parameters include a partition parameter indicative of dividing an image block into a plurality of partitions that includes a first partition that has a non-rectangular shape and a second partition; divides the image block into the plurality of partitions based on the partition parameter; and decodes the first partition and the second partition.
[0096] According to another aspect, an image decoding method that includes a partition syntax operation is provided. The method generally includes three steps: parsing one or more parameters from a bit stream, where the one or more parameters include a partition parameter indicative of dividing an image block into a plurality of partitions that includes a first partition that has a non-rectangular shape and a second partition; dividing the image block into the plurality of partitions based on the partition parameter; and decoding the first partition and the second partition.
[0097] In the drawings, identical reference numbers identify similar elements. The sizes and relative positions of elements in the drawings are not necessarily drawn to scale.
[0098] Hereinafter, embodiment(s) will be described with reference to the drawings. Note that the embodiment(s) described below each show a general or specific example. The numerical values, shapes, materials, components, the arrangement and connection of the components, steps, the relation and order of the steps, etc., indicated in the following embodiment(s) are merely examples and are not intended to limit the scope of the claims. Therefore, those components described in the following embodiment(s) but not recited in any of the independent claims, which define the most generic concepts, are described as optional components.
[0099] Embodiments of an encoder and a decoder will be described below. The embodiments are examples of an encoder and a decoder to which the processes and/or configurations presented in the description of aspects of the present description are applicable. The processes and/or configurations can also be implemented in an encoder and a decoder different from those according to the embodiments. For example, regarding the processes and/or configurations as applied in the embodiments, any of the following can be implemented:
[00100] (1) Any of the components of the encoder or decoder according to the embodiments presented in the description of aspects of the present description can be substituted or combined with another component presented anywhere in the description of aspects of the present description.
[00101] (2) In the encoder or decoder according to the embodiments, arbitrary changes can be made to functions or processes performed by one or more components of the encoder or decoder, such as addition, substitution, removal, etc., of the functions or processes. For example, any function or process can be substituted or combined with another function or process presented anywhere in the description of aspects of the present description.
[00102] (3) In the method implemented by the encoder or the decoder according to the embodiments, arbitrary changes can be made, such as addition, substitution, and removal of one or more of the processes included in the method. For example, any process in the method can be substituted or combined with another process presented anywhere in the description of aspects of the present description.
[00103] (4) One or more components included in the encoder or decoder according to the embodiments can be combined with a component presented anywhere in the description of aspects of the present description, can be combined with a component that includes one or more functions presented anywhere in the description of aspects of the present description, and can be combined with a component that implements one or more processes implemented by a component presented in the description of aspects of the present description.
[00104] (5) A component that includes one or more functions of the encoder or decoder according to the embodiments, or a component that implements one or more processes of the encoder or decoder according to the embodiments, can be combined or substituted with a component presented anywhere in the description of aspects of the present description, with a component that includes one or more functions presented anywhere in the description of aspects of the present description, or with a component that implements one or more processes presented anywhere in the description of aspects of the present description.
[00105] (6) In the method implemented by the encoder or decoder according to the embodiments, any of the processes included in the method can be substituted or combined with a process presented anywhere in the description of aspects of the present description or with any corresponding or equivalent process.
[00106] (7) One or more processes included in the method implemented by the encoder or decoder according to the embodiments can be combined with a process presented anywhere in the description of aspects of the present description.
[00107] (8) The implementation of the processes and/or configurations presented in the description of aspects of the present description is not limited to the encoder or decoder according to the embodiments. For example, the processes and/or configurations can be implemented on a device used for a purpose other than the moving image encoder or moving image decoder described in the embodiments. (Encoder)
[00108] First, the encoder according to an embodiment will be described. Figure 1 is a block diagram illustrating a functional configuration of encoder 100 according to the embodiment. Encoder 100 is a moving image encoder that encodes a moving image block by block.
[00109] As illustrated in Figure 1, encoder 100 is a device that encodes an image block by block, and includes divider 102, subtractor 104, transformer 106, quantizer 108, entropy encoder 110, inverse quantizer 112, inverse transformer 114, adder 116, block memory 118, loop filter 120, frame memory 122, intra predictor 124, inter predictor 126, and prediction controller 128.
[00110] Encoder 100 is realized as, for example, a generic processor and memory. In this case, when a software program stored in the memory is executed by the processor, the processor functions as divider 102, subtractor 104, transformer 106, quantizer 108, entropy encoder 110, inverse quantizer 112, inverse transformer 114, adder 116, loop filter 120, intra predictor 124, inter predictor 126, and prediction controller 128. Alternatively, encoder 100 can be realized as one or more dedicated electronic circuits corresponding to divider 102, subtractor 104, transformer 106, quantizer 108, entropy encoder 110, inverse quantizer 112, inverse transformer 114, adder 116, loop filter 120, intra predictor 124, inter predictor 126, and prediction controller 128.
[00111] From now on, each component included in encoder 100 will be described. (Divider)
[00112] Divider 102 divides each image included in a moving image input into blocks, and outputs each block to subtractor 104. For example, divider 102 first divides an image into blocks of a fixed size (for example, 128x128). The fixed-size block is also referred to as a coding tree unit (CTU). Divider 102 then divides each fixed-size block into blocks of variable sizes (for example, 64x64 or smaller), based on recursive quadtree and/or binary tree block division. The variable-size block is also referred to as a coding unit (CU), a prediction unit (PU), or a transform unit (TU). In various implementations, there may be no need to differentiate between CU, PU, and TU; all or some of the blocks in an image can be processed per CU, PU, or TU.
[00113] Figure 2 illustrates an example of block division according to an embodiment. In Figure 2, the solid lines represent block boundaries of blocks divided by quadtree block division, and the dashed lines represent block boundaries of blocks divided by binary tree block division.
[00114] Here, block 10 is a 128x128 pixel square block (128x128 block). This 128x128 block 10 is first divided into four square 64x64 blocks (quadtree block division).
[00115] The upper left 64x64 block is further vertically divided into two rectangular 32x64 blocks, and the left 32x64 block is further vertically divided into two rectangular 16x64 blocks (binary tree block division). As a result, the upper left 64x64 block is divided into two 16x64 blocks 11 and 12, and one 32x64 block 13.
[00116] The upper right 64x64 block is horizontally divided into two rectangular 64x32 blocks 14 and 15 (binary tree block division).
[00117] The lower left 64x64 block is first divided into four square 32x32 blocks (quadtree block division). The upper left block and the lower right block among the four 32x32 blocks are further divided. The upper left 32x32 block is vertically divided into two rectangular 16x32 blocks, and the right 16x32 block is further horizontally divided into two 16x16 blocks (binary tree block division). The lower right 32x32 block is horizontally divided into two 32x16 blocks (binary tree block division). As a result, the lower left 64x64 block is divided into 16x32 block 16, two 16x16 blocks 17 and 18, two 32x32 blocks 19 and 20, and two 32x16 blocks 21 and 22.
[00118] The lower right 64x64 block 23 is not divided.
[00119] As described above, in Figure 2, block 10 is divided into 13 variable-size blocks 11 to 23 based on recursive quadtree and binary tree block division. This type of division is also referred to as quadtree plus binary tree (QTBT) division.
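A compact sketch of how such recursive splitting can be represented follows; the function and the split labels are assumptions for illustration, not part of this description.

```python
def split_block(x, y, w, h, mode):
    # Returns child blocks as (x, y, width, height) tuples.
    if mode == "quad":  # quadtree: four square children
        hw, hh = w // 2, h // 2
        return [(x, y, hw, hh), (x + hw, y, hw, hh),
                (x, y + hh, hw, hh), (x + hw, y + hh, hw, hh)]
    if mode == "hor":   # binary tree, horizontal split: two w x h/2 children
        return [(x, y, w, h // 2), (x, y + h // 2, w, h // 2)]
    if mode == "ver":   # binary tree, vertical split: two w/2 x h children
        return [(x, y, w // 2, h), (x + w // 2, y, w // 2, h)]
    raise ValueError("unknown split mode")

# 128x128 CTU -> quadtree, then the upper left 64x64 -> two 32x64 blocks.
ctu_children = split_block(0, 0, 128, 128, "quad")
upper_left_children = split_block(0, 0, 64, 64, "ver")
```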
[00120] Note that, in Figure 2, a block is divided into four or two blocks (quadtree or binary tree block division), but division is not limited to these examples. For example, a block can be divided into three blocks (ternary block division). Division that includes such ternary block division is also referred to as multi-type tree (MBT) division. (Subtractor)
[00121] Subtractor 104 subtracts a prediction signal (prediction sample, input from prediction controller 128, to be described below) from an original signal (original sample) of each block input from divider 102. In other words, subtractor 104 calculates prediction errors (also referred to as residuals) of a block to be encoded (hereinafter referred to as a current block), and outputs the calculated prediction errors to transformer 106.
[00122] The original signal is a signal input into encoder 100, and is a signal that represents an image for each image included in a moving image (for example, a luma signal and two chroma signals). Hereinafter, a signal representing an image is also referred to as a sample. (Transformer)
[00123] Transformer 106 transforms spatial domain prediction errors into frequency domain transform coefficients, and outputs the transform coefficients to quantizer 108. More specifically, transformer 106 applies, for example, a predefined discrete cosine transform (DCT) or discrete sine transform (DST) to the spatial domain prediction errors.
[00124] Note that transformer 106 can adaptively select a transform type from a plurality of transform types, and transform prediction errors into transform coefficients using a transform basis function that corresponds to the selected transform type. This kind of transform is also referred to as explicit multiple core transform (EMT) or adaptive multiple transform (AMT).
[00125] The transform types include, for example, DCT-II, DCT-V, DCT-VIII, DST-I, and DST-VII. Figure 3 is a table that indicates transform basis functions for each transform type. In Figure 3, N indicates the number of input pixels. For example, selection of a transform type from among the plurality of transform types can depend on a prediction type (intra prediction or inter prediction), and can depend on an intra prediction mode.
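To make the notion of a transform basis function concrete, here is a small sketch of a 1-D DCT-II, one of the transform types listed above; an EMT/AMT-style encoder would evaluate several such bases and signal which one it selected. The function names are illustrative.

```python
import math

def dct2_matrix(n: int):
    # N x N orthonormal DCT-II basis; row k is the k-th basis function
    # sampled at the N input positions.
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def transform_1d(samples):
    # Project the residual samples onto each basis function.
    n = len(samples)
    m = dct2_matrix(n)
    return [sum(m[k][i] * samples[i] for i in range(n)) for k in range(n)]

coeffs = transform_1d([10.0, 8.0, -2.0, -4.0])  # 4-point DCT-II of a toy residual
```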
[00126] Information indicating whether to apply such EMT or AMT (referred to as, for example, an EMT flag or an AMT flag) and information indicating the selected transform type are typically signaled at the CU level. Note that the signaling of such information does not have to be performed at the CU level, and can be performed at another level (for example, at the bit stream level, image level, slice level, tile level, or CTU level).
[00127] Furthermore, transformer 106 can apply a secondary transform to the transform coefficients (transform result). Such a secondary transform is also referred to as an adaptive secondary transform (AST) or a non-separable secondary transform (NSST). For example, transformer 106 applies a secondary transform to each sub-block (for example, each 4x4 sub-block) included in the block of transform coefficients that correspond to intra prediction errors. Information indicating whether to apply NSST and information regarding the transform matrix used in NSST are typically signaled at the CU level. Note that the signaling of such information does not have to be performed at the CU level, and can be performed at another level (for example, at the sequence level, image level, slice level, tile level, or CTU level).
[00128] Either a separable transform or a non-separable transform can be applied by transformer 106. A separable transform is a method in which a transform is performed a plurality of times by separately performing a transform for each direction according to the number of dimensions of the input. A non-separable transform is a method in which two or more dimensions of a multidimensional input are collectively regarded as a single dimension and the transform is performed collectively.
[00129] In one example of a non-separable transform, when the input is a 4x4 block, the 4x4 block is regarded as a single array that includes 16 elements, and the transform applies a 16x16 transform matrix to the array.
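The following sketch illustrates this collective treatment: the 4x4 block is flattened to a 16-element vector and multiplied by one 16x16 matrix. A random orthogonal matrix stands in for a real NSST kernel, which would be trained or predefined; this is an assumption for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
block = rng.integers(-8, 8, size=(4, 4)).astype(float)  # toy 4x4 coefficient block

# Build an orthogonal 16x16 matrix as a stand-in for an NSST kernel.
kernel, _ = np.linalg.qr(rng.standard_normal((16, 16)))

coeffs = (kernel @ block.reshape(16)).reshape(4, 4)       # forward non-separable transform
restored = (kernel.T @ coeffs.reshape(16)).reshape(4, 4)  # inverse via the transpose
assert np.allclose(restored, block)
```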
[00130] In an additional example of a non-separable transform, after the input 4x4 block is regarded as a single array that includes 16 elements, a transform that performs a plurality of Givens rotations on the array (for example, a Hypercube-Givens Transform) can be applied. (Quantizer)
[00131] Quantizer 108 quantizes the transform coefficients output from transformer 106. More specifically, quantizer 108 scans, in a predetermined scanning order, the transform coefficients of the current block, and quantizes the scanned transform coefficients based on quantization parameters (QP) that correspond to the transform coefficients. Quantizer 108 then outputs the quantized transform coefficients (hereinafter referred to as quantized coefficients) of the current block to entropy encoder 110 and inverse quantizer 112.
[00132] A predetermined scanning order is an order for quantization/inverse quantization of transform coefficients. For example, a predetermined scanning order is defined as ascending order of frequency (from low to high frequency) or descending order of frequency (from high to low frequency).
[00133] A quantization parameter (QP) is a parameter that defines a quantization step size (quantization width). For example, if the value of the quantization parameter increases, the quantization step size also increases. In other words, if the value of the quantization parameter increases, the quantization error increases.
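The QP-to-step-size relationship can be sketched as follows. The exact mapping (step roughly doubling every 6 QP, as in H.264/HEVC-style designs) is an assumption for illustration, not a value taken from this description.

```python
def quantize(coeffs, qp: int):
    # Scalar quantization: larger QP -> larger step -> coarser levels
    # and larger quantization error.
    step = 2 ** (qp / 6.0)
    return [round(c / step) for c in coeffs]

def inverse_quantize(levels, qp: int):
    step = 2 ** (qp / 6.0)
    return [level * step for level in levels]

levels = quantize([10.0, -3.2, 0.7], qp=12)   # step = 4.0 -> [2, -1, 0]
approx = inverse_quantize(levels, qp=12)      # [8.0, -4.0, 0.0]: error vs. the input
```

(Entropy Encoder)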
[00134] Entropy encoder 110 generates an encoded signal (encoded bit stream) based on the quantized coefficients, which are input from quantizer 108. More specifically, entropy encoder 110, for example, binarizes the quantized coefficients and arithmetic encodes the binary signal to output a compressed bit stream or sequence. (Inverse quantizer)
[00135] Inverse quantizer 112 inverse quantizes the quantized coefficients, which are input from quantizer 108. More specifically, inverse quantizer 112 inverse quantizes, in a predetermined scanning order, the quantized coefficients of the current block. Inverse quantizer 112 then outputs the inverse quantized transform coefficients of the current block to inverse transformer 114. (Inverse transformer)
[00136] Inverse transformer 114 restores prediction errors (residuals) by inverse transforming the transform coefficients, which are input from inverse quantizer 112. More specifically, inverse transformer 114 restores the prediction errors of the current block by applying an inverse transform that corresponds to the transform applied by transformer 106 to the transform coefficients. Inverse transformer 114 then outputs the restored prediction errors to adder 116.
[00137] Note that, typically, since information is lost in quantization, the restored prediction errors do not match the prediction errors calculated by subtractor 104. In other words, the restored prediction errors typically include quantization errors. (Adder)
[00138] Adder 116 reconstructs the current block by adding the prediction errors, which are input from inverse transformer 114, and prediction samples, which are input from prediction controller 128. Adder 116 then outputs the reconstructed block to block memory 118 and loop filter 120. A reconstructed block is also referred to as a locally decoded block. (Block memory)
[00139] Block memory 118 is storage for storing blocks in an image to be encoded (hereinafter referred to as a "current image") for reference in intra prediction, for example. More specifically, block memory 118 stores the reconstructed blocks output from adder 116. (Loop filter)
[00140] Loop filter 120 applies a loop filter to blocks reconstructed by adder 116, and outputs the filtered reconstructed blocks to frame memory 122. A loop filter is a filter used in a coding loop (in-loop filter), and includes, for example, a deblocking filter (DF), a sample adaptive offset (SAO), and an adaptive loop filter (ALF).
[00141] In ALF, a least square error filter for removing compression artifacts is applied. For example, one filter from a plurality of filters is selected for each 2x2 sub-block in the current block based on the direction and activity of local gradients, and is applied.
[00142] More specifically, first, each sub-block (for example, each 2x2 sub-block) is categorized into one of a plurality of classes (for example, 15 or 25 classes). The sub-block classification is based on gradient directionality and activity. For example, classification index C is derived based on gradient directionality D (for example, 0 to 2 or 0 to 4) and gradient activity A (for example, 0 to 4) (for example, C = 5D + A). Then, based on classification index C, each sub-block is categorized into one of a plurality of classes (for example, 15 or 25 classes).
[00143] For example, gradient directionality D is calculated by comparing gradients in a plurality of directions (for example, the horizontal, vertical, and two diagonal directions). Furthermore, for example, gradient activity A is calculated by adding gradients in a plurality of directions and quantizing the sum.
[00144] The filter to be used for each sub-block is determined from among the plurality of filters based on the result of such categorization.
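A direct transcription of the C = 5D + A classification into code, with a placeholder filter bank; the filter contents are assumptions, and only the index arithmetic comes from the text.

```python
def alf_class(directionality: int, activity: int) -> int:
    # Classification index C = 5D + A: with D in 0..4 and A in 0..4 this
    # yields one of 25 classes, matching the class counts quoted above.
    assert 0 <= directionality <= 4 and 0 <= activity <= 4
    return 5 * directionality + activity

filter_bank = {c: ("filter", c) for c in range(25)}  # placeholder per-class filters
chosen = filter_bank[alf_class(directionality=2, activity=3)]  # class 13
```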
[00145] The filter shape to be used in ALF is, for example, a circularly symmetric filter shape. Figures 4A, 4B, and 4C illustrate examples of filter shapes used in ALF. Figure 4A illustrates a 5x5 diamond shape filter, Figure 4B illustrates a 7x7 diamond shape filter, and Figure 4C illustrates a 9x9 diamond shape filter. Information indicating the filter shape is typically signaled at the image level. Note that the signaling of information indicating the filter shape does not need to be performed at the image level, and can be performed at another level (for example, at the sequence level, slice level, tile level, CTU level, or CU level).
[00146] The enabling or disabling of ALF can be determined at the image level or CU level, for example.
[00147] The coefficient sets for the plurality of selectable filters (for example, 15 or 25 filters) are signaled at the image level. Note that the signaling of the coefficient sets does not need to be performed at the image level, and can be performed at another level (for example, at the sequence level, slice level, tile level, CTU level, CU level, or sub-block level). (Frame Memory)
[00148] Frame memory 122 is storage for storing reference images used in inter prediction, for example, and is also referred to as a frame buffer. More specifically, frame memory 122 stores reconstructed blocks filtered by loop filter 120. (Intra predictor)
[00149] Intra predictor 124 generates a prediction signal (intra prediction signal) by intra predicting the current block with reference to a block or blocks in the current image stored in block memory 118 (also referred to as intra frame prediction). More specifically, intra predictor 124 generates an intra prediction signal by performing intra prediction with reference to samples (for example, luma and/or chroma values) of a block or blocks neighboring the current block, and then outputs the intra prediction signal to prediction controller 128.
[00150] For example, intra predictor 124 performs intra prediction using one mode from among a plurality of predefined intra prediction modes. The intra prediction modes typically include one or more non-directional prediction modes and a plurality of directional prediction modes.
[00151] The one or more non-directional prediction modes include, for example, the planar prediction mode and the DC prediction mode defined in the H.265/HEVC standard.
[00152] The plurality of directional prediction modes includes, for example, the 33 directional prediction modes defined in the H.265/HEVC standard. Note that the plurality of directional prediction modes can also include 32 directional prediction modes in addition to the 33 directional prediction modes (for a total of 65 directional prediction modes).
[00153] Figure 5A illustrates the 67 intra prediction modes used in intra prediction (two non-directional prediction modes and 65 directional prediction modes). The solid arrows represent the 33 directions defined in the H.265/HEVC standard, and the dashed arrows represent the 32 additional directions. (The two "non-directional" prediction modes are not shown in Figure 5A.)
[00154] In various implementations, a luma block can be referred to in chroma block intra prediction. That is, a chroma component of the current block can be predicted based on a luma component of the current block. Such intra prediction is also referred to as cross-component linear model (CCLM) prediction. The chroma block intra prediction mode that references a luma block (referred to as, for example, CCLM mode) can be added as one of the chroma block intra prediction modes.
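A rough sketch of the linear cross-component idea follows: fit chroma ≈ a·luma + b on reconstructed neighboring samples, then apply the model to the block's luma samples. Using a least squares fit is an assumption for illustration; practical codecs derive a and b with cheaper rules.

```python
import numpy as np

def cclm_predict(neigh_luma, neigh_chroma, block_luma):
    # Fit chroma = a * luma + b on reconstructed neighboring samples,
    # then predict the chroma block from the co-located luma samples.
    a, b = np.polyfit(np.asarray(neigh_luma, float),
                      np.asarray(neigh_chroma, float), 1)
    return a * np.asarray(block_luma, float) + b

pred_chroma = cclm_predict([50, 60, 70, 80], [30, 34, 38, 42], [[55, 65], [75, 85]])
```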
[00155] Intra predictor 124 can correct intra-predicted pixel values based on horizontal/vertical reference pixel gradients. Intra prediction accompanied by this kind of correction is also referred to as position dependent intra prediction combination (PDPC). Information indicating whether to apply PDPC (referred to as, for example, a PDPC flag) is typically signaled at the CU level. (Inter predictor)
[00156] [00156] Interpreter 126 generates a prediction signal (interpretation signal) interpreting the current block with reference to a block or blocks in a reference image, which is different from the current image and is stored in the frame memory 122 (also referred to as interframe prediction). Interpretation is performed by the current block or by the current sub-block (for example, by 4x4 block) in the current block. For example, interpreter 126 performs movement estimation on a reference image for the current block or the current sub-block, to find a reference block or sub-block in the reference image that best matches the block. - current co or sub-block, and to obtain movement information (for example, a motion vector) that compensates (or predicts) the movement or change of the reference block or sub-block to the block or sub-block chain. Interpreter 126 then performs motion compensation (or motion prediction) based on the movement information, and generates an interpretation signal for the current block or sub-block based on the movement information. Interpreter 126 then outputs the generated interpreter signal to prediction controller 128.
[00157] [00157] The movement information used in motion compensation can be signaled in a variety of ways such as the interpretation signal. For example, a motion vector can be signaled. As another example, a difference between a motion vector and a motion vector predictor can be signaled.
[00158] [00158] Note that the interprediction signal can be generated using motion information of a neighboring block in addition to motion information of the current block obtained from motion estimation. More specifically, the interprediction signal can be generated per sub-block in the current block by calculating a weighted sum of a prediction signal based on motion information obtained from motion estimation (in the reference image) and a prediction signal based on motion information of a neighboring block (in the current image). Such interprediction (motion compensation) is also referred to as overlapped block motion compensation (OBMC).
[00159] [00159] In OBMC mode, information indicating the sub-block size for OBMC (referred to as, for example, OBMC block size) can be signaled at the sequence level. Furthermore, information indicating whether or not to apply OBMC mode (referred to as, for example, an OBMC flag) can be signaled at the CU level. Note that the signaling of such information need not be performed at the sequence level and CU level, and can be performed at another level (for example, at the image level, slice level, tile level, CTU level, or sub-block level).
[00160] [00160] From now on, the OBMC mode will be described in more detail. Figure 5B is a flowchart and Figure 5C is a conceptual diagram to illustrate a prediction image correction process performed through OBMC processing.
[00161] [00161] Referring to Figure 5C, first, a prediction image (Pred) is obtained through typical motion compensation using a motion vector (MV) assigned to the target (current) block.
[00162] [00162] Next, a prediction image (Pred_L) is obtained by applying (reusing) a motion vector (MV_L), which has already been derived for the encoded left neighboring block, to the target (current) block, as indicated by an arrow "MV_L" that originates from the current block and points to the reference image used to obtain the prediction image Pred_L. Then, the two prediction images Pred and Pred_L are superimposed to perform a first step of correcting the prediction image, which in one aspect has the effect of blending the border between neighboring blocks.
[00163] [00163] Similarly, a prediction image (Pred_U) is obtained by applying (reusing) a motion vector (MV_U), which has already been derived for the encoded upper neighboring block, to the target (current) block, as indicated by an arrow "MV_U" originating from the current block and pointing to the reference image used to obtain the prediction image Pred_U. Then, the prediction image Pred_U is superimposed with the prediction image resulting from the first step (that is, Pred and Pred_L) to perform a second step of correcting the prediction image, which in one aspect has the effect of blending the border between neighboring blocks. The result of the second step is the final prediction image for the current block, with borders blended (smoothed) with its neighboring blocks.
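The two-step correction described above can be sketched as follows; the one-pixel overlap width and the 1/4 blending weight are illustrative assumptions, since the text does not fix these values.

```python
import numpy as np

def obmc_correct(pred, pred_l, pred_u, overlap=1, w=0.25):
    """Minimal OBMC sketch: blend the prediction obtained with the
    current block's MV (pred) with predictions obtained by reusing the
    left (pred_l) and upper (pred_u) neighbors' MVs, only over a thin
    boundary region. overlap and w are illustrative assumptions."""
    out = pred.astype(np.float64)
    # First step: blend with the left neighbor's prediction along the left edge.
    out[:, :overlap] = (1 - w) * out[:, :overlap] + w * pred_l[:, :overlap]
    # Second step: blend with the upper neighbor's prediction along the top edge.
    out[:overlap, :] = (1 - w) * out[:overlap, :] + w * pred_u[:overlap, :]
    return np.rint(out).astype(pred.dtype)
```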
[00164] [00164] Note that the example above is a two-step correction method using the left and upper neighboring blocks, but the method can be a three-step or higher correction method that also uses the right and/or lower neighboring blocks.
[00165] [00165] Note that the region subject to overlap may be the entire pixel region of the block, and, alternatively, it may be a partial block boundary region.
[00166] [00166] Note that here, the OBMC prediction image correction process has been described as being based on a single reference image to derive a single prediction image Pred, onto which the additional prediction images Pred_L and Pred_U are superimposed, but the same process can be applied to each of a plurality of reference images when the prediction image is corrected based on a plurality of reference images. In such a case, after a plurality of corrected prediction images is obtained by performing the OBMC image correction based on each of the plurality of reference images, the obtained corrected prediction images are further superimposed to obtain the final prediction image.
[00167] [00167] Note that in OBMC, the target block unit can be a prediction block or, alternatively, a sub-block obtained by further dividing the prediction block.
[00168] [00168] An example of a method for determining whether to implement OBMC processing is to use an obmc_flag, which is a signal that indicates whether to implement OBMC processing. As a specific example, the encoder can determine whether the target block belongs to a region that includes complicated movement. The encoder sets the obmc_flag to a value of "1" when the block belongs to a region that includes complicated movement and implements OBMC processing when encoding, and sets the obmc_flag to a value of "0" when the block does not belong to a region that includes complicated movement, and encodes the block without implementing OBMC processing. The decoder switches between implementing OBMC processing or not by decoding the obmc_flag written in the stream (that is, the compressed sequence) and performing the decoding according to the flag value.
[00169] [00169] Note that the motion information can be derived on the decoder side without being signaled from the encoder side. For example, a merge mode defined in the H.265/HEVC standard can be used. Furthermore, for example, the motion information can be derived by performing motion estimation on the decoder side. In this case, the decoder side can perform the motion estimation without using the pixel values of the current block.
[00170] [00170] Here, a mode for performing motion estimation on the decoder side will be described. A mode for performing motion estimation on the decoder side is also referred to as pattern matched motion vector derivation (PMMVD) mode or frame rate up-conversion (FRUC) mode.
[00171] [00171] An example of FRUC processing is illustrated in Figure 5D. First, a candidate list (which can be a merge list) of candidates, each including a motion vector (MV) predictor, is generated with reference to motion vectors of encoded blocks that spatially or temporally neighbor the current block. Next, the best candidate MV is selected from among the plurality of candidate MVs registered in the candidate list. For example, evaluation values are calculated for the candidate MVs included in the candidate list, and one candidate MV is selected based on the calculated evaluation values.
[00172] [00172] Next, a motion vector for the current block is derived from the motion vector of the selected candidate. More specifically, for example, the motion vector for the current block is calculated as the motion vector of the selected candidate MV. Alternatively, the motion vector for the current block can be derived by pattern matching performed in the vicinity of a position in a reference image corresponding to the motion vector of the selected candidate MV.
[00173] [00173] The same processes can be performed in cases where the processing is performed in units of sub-blocks.
[00174] [00174] An evaluation value can be calculated in a variety of ways. For example, a reconstructed image of a region in a reference image that corresponds to a motion vector is compared with a reconstructed image of a predetermined region (which can be in another reference image or in a block neighboring the current block in the current image, for example, as described below), and a difference between the pixel values of the two reconstructed images can be calculated and used as the evaluation value of the motion vector. Note that the evaluation value can be calculated using other information in addition to the difference.
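As an illustrative sketch of such an evaluation value, the following assumes an integer-pel candidate MV and uses the sum of absolute differences (SAD) as the difference measure; both are assumptions for illustration only.

```python
import numpy as np

def evaluate_candidate_mv(cur_region, ref_pic, pos, mv):
    """Minimal sketch of a FRUC-style evaluation value, assuming the
    sum of absolute differences (SAD) as the difference measure.
    cur_region: reconstructed samples of the predetermined region;
    pos: (y, x) of that region; mv: candidate MV in integer pels."""
    h, w = cur_region.shape
    y, x = pos[0] + mv[1], pos[1] + mv[0]
    ref_region = ref_pic[y:y + h, x:x + w]
    # Lower SAD means a better-matching candidate MV.
    return np.abs(cur_region.astype(np.int64)
                  - ref_region.astype(np.int64)).sum()
```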
[00175] [00175] Next, pattern matching is described in detail. First, a candidate MV included in a candidate list (for example, a merge list) is selected as a starting point for the pattern matching search. The pattern matching used is either first pattern matching or second pattern matching. First pattern matching and second pattern matching are also referred to as bilateral matching and template matching, respectively.
[00176] [00176] In first pattern matching, pattern matching is performed between two blocks in two different reference images that are both along the motion trajectory of the current block. Therefore, in first pattern matching, a region in another reference image that conforms to the motion trajectory of the current block is used as the predetermined region for the calculation of the candidate evaluation value described above.
[00177] [00177] Figure 6 illustrates an example of first pattern matching (bilateral matching) between two blocks in two reference images along a motion trajectory. As illustrated in Figure 6, in first pattern matching, two motion vectors (MV0, MV1) are derived by finding the best match between two blocks in two different reference images (Ref0, Ref1) along the motion trajectory of the current block (Cur block). More specifically, a difference can be obtained between (i) a reconstructed image at a position specified by a candidate MV in a first encoded reference image (Ref0) and (ii) a reconstructed image at a position specified by the candidate MV symmetrically scaled by the display time intervals, in a second encoded reference image (Ref1). Then, the difference can be used to derive an evaluation value for the current block. The candidate MV that has the best evaluation value among the plurality of candidate MVs can be selected as the final MV.
[00178] [00178] Under the assumption of a continuous motion trajectory, the motion vectors (MV0, MV1) that point to the two reference blocks are proportional to the temporal distances (τ0, τ1) between the current image (Cur Pic) and the two reference images (Ref0, Ref1). For example, when the current image is temporally between the two reference images and the temporal distances from the current image to the two reference images are equal, mirror-symmetric bidirectional motion vectors are derived in first pattern matching.
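The following sketch illustrates a bilateral-matching cost under these assumptions; integer-pel positions, SAD cost, and simple rounding of the scaled MV are illustrative simplifications.

```python
import numpy as np

def bilateral_cost(ref0, ref1, pos, mv, tau0, tau1, size):
    """Minimal bilateral-matching sketch: compare the block pointed to
    by the candidate MV in Ref0 with the block pointed to by the same
    MV scaled by the temporal distances in Ref1. Integer-pel positions
    and SAD are simplifying assumptions."""
    h, w = size
    y0, x0 = pos[0] + mv[1], pos[1] + mv[0]
    # Symmetric scaling along the assumed continuous motion trajectory.
    s = -tau1 / tau0
    y1, x1 = pos[0] + round(mv[1] * s), pos[1] + round(mv[0] * s)
    b0 = ref0[y0:y0 + h, x0:x0 + w].astype(np.int64)
    b1 = ref1[y1:y1 + h, x1:x1 + w].astype(np.int64)
    return np.abs(b0 - b1).sum()
```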
[00179] [00179] In second pattern matching (template matching), pattern matching is performed between a template in the current image (blocks neighboring the current block in the current image; for example, the upper and/or left neighboring blocks) and a block in a reference image. Therefore, in second pattern matching, a block neighboring the current block in the current image is used as the predetermined region for the calculation of the candidate evaluation value described above.
[00180] [00180] Figure 7 illustrates an example of pattern matching (template matching) between a template in the current image and a block in a reference image. As illustrated in Figure 7, in second pattern matching, a motion vector for the current block (Cur block) is derived by searching a reference image (Ref0) to find the block that best matches the block(s) neighboring the current block in the current image (Cur Pic). More specifically, a difference can be obtained between (i) a reconstructed image of one or both of the encoded upper and left neighboring regions relative to the current block, and (ii) a reconstructed image of the same regions relative to a block position specified by the candidate MV in an encoded reference image (Ref0). Then, the difference can be used to derive an evaluation value for the current block. The candidate MV that has the best evaluation value among the plurality of candidate MVs can be selected as the best candidate MV.
[00181] [00181] Information indicating whether or not to apply FRUC mode (referred to as, for example, a FRUC flag) can be signaled at the CU level. Furthermore, when FRUC mode is applied (for example, when the FRUC flag is set to true), information indicating the applicable pattern matching method (first pattern matching or second pattern matching) can be signaled at the CU level. Note that the signaling of such information need not be performed at the CU level, and can be performed at another level (for example, at the sequence level, image level, slice level, tile level, CTU level, or sub-block level).
[00182] [00182] The following describes methods of deriving a motion vector. First, a description is given of a mode for deriving a motion vector based on a model that assumes uniform linear motion. This mode is also referred to as bidirectional optical flow (BIO) mode.
[00183] [00183] Figure 8 illustrates a model that assumes uniform linear motion. In Figure 8, (v_x, v_y) denotes a velocity vector, and τ0 and τ1 denote the temporal distances between the current image (Cur Pic) and the two reference images (Ref0, Ref1), respectively. (MVx0, MVy0) denotes the motion vector that corresponds to reference image Ref0, and (MVx1, MVy1) denotes the motion vector that corresponds to reference image Ref1.
[00184] [00184] Here, under the assumption of uniform linear motion represented by the velocity vector (v_x, v_y), (MVx0, MVy0) and (MVx1, MVy1) are represented as (v_x·τ0, v_y·τ0) and (−v_x·τ1, −v_y·τ1), respectively, and the following optical flow equation (Equation 1) is given.
[MATH. 1]
∂I^(k)/∂t + v_x · ∂I^(k)/∂x + v_y · ∂I^(k)/∂y = 0.     (1)
[00185] [00185] Here, I^(k) denotes a luma value of reference image k (k = 0, 1) after motion compensation. The optical flow equation shows that the sum of (i) the time derivative of the luma value, (ii) the product of the horizontal velocity and the horizontal component of the spatial gradient of a reference image, and (iii) the product of the vertical velocity and the vertical component of the spatial gradient of a reference image is equal to zero. A motion vector for each block obtained from, for example, a merge list can be corrected pixel by pixel based on a combination of the optical flow equation and Hermite interpolation.
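As an illustrative sketch, a velocity vector consistent with Equation 1 can be estimated over a small window by least squares; approximating the time derivative by the difference of the two motion-compensated windows is an assumption made purely for illustration.

```python
import numpy as np

def estimate_velocity(i0, i1):
    """Minimal sketch: estimate (vx, vy) that best satisfy the optical
    flow equation over a small window, via least squares. i0, i1 are
    motion-compensated luma windows from Ref0 and Ref1; treating their
    difference as the time derivative is a simplifying assumption."""
    gx = 0.5 * (np.gradient(i0, axis=1) + np.gradient(i1, axis=1))
    gy = 0.5 * (np.gradient(i0, axis=0) + np.gradient(i1, axis=0))
    gt = (i1 - i0).astype(np.float64).ravel()
    a = np.stack([gx.ravel(), gy.ravel()], axis=1)
    # Solve a @ [vx, vy] = -gt in the least-squares sense.
    (vx, vy), *_ = np.linalg.lstsq(a, -gt, rcond=None)
    return vx, vy
```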
[00186] [00186] Note that a motion vector can be derived on the decoder side using a method other than deriving a motion vector based on a model assuming uniform linear motion. For example, a motion vector can be derived for each sub-block based on motion vectors from neighboring blocks.
[00187] [00187] The following is a description of a mode in which a motion vector is derived for each sub-block based on motion vectors of neighboring blocks. This mode is also referred to as affine motion compensation prediction mode.
[00188] [00188] Figure 9A illustrates an example of deriving a motion vector for each sub-block based on motion vectors of neighboring blocks. In Figure 9A, the current block includes 16 4x4 sub-blocks. Here, the motion vector v0 of the upper left corner control point of the current block is derived based on motion vectors of neighboring sub-blocks. Similarly, the motion vector v1 of the upper right corner control point of the current block is derived based on motion vectors of neighboring blocks. Then, using the two motion vectors v0 and v1, the motion vector (v_x, v_y) of each sub-block in the current block is derived using the following equation (Equation 2).
[MATH. 2]
v_x = ((v1x − v0x)/w)·x − ((v1y − v0y)/w)·y + v0x
v_y = ((v1y − v0y)/w)·x + ((v1x − v0x)/w)·y + v0y     (2)
[00189] [00189] Here, x and y are the horizontal and vertical positions of the sub-block, respectively, and w is a predetermined weighting coefficient.
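A minimal sketch of Equation 2 follows; treating w as the block width, as in a typical 4-parameter affine model, is an illustrative assumption.

```python
def affine_subblock_mv(v0, v1, x, y, w):
    """Minimal sketch of Equation 2: derive a sub-block MV from the two
    control-point MVs v0 (top-left) and v1 (top-right). x, y are the
    sub-block position and w is the weighting coefficient (for a
    4-parameter model, typically the block width)."""
    vx = (v1[0] - v0[0]) / w * x - (v1[1] - v0[1]) / w * y + v0[0]
    vy = (v1[1] - v0[1]) / w * x + (v1[0] - v0[0]) / w * y + v0[1]
    return vx, vy
```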
[00190] [00190] An affine motion compensation prediction mode can include a number of modes with different methods of deriving the motion vectors of the upper left and upper right corner control points. Information indicating an affine motion compensation prediction mode (referred to as, for example, an affine flag) is signaled at the CU level. Note that the signaling of the information indicating the affine motion compensation prediction mode need not be performed at the CU level, and can be performed at another level (for example, at the sequence level, image level, slice level, tile level, CTU level, or sub-block level). (Prediction Controller)
[00191] [00191] Prediction controller 128 selects either the intraprediction signal (output from intrapredictor 124) or the interprediction signal (output from interpredictor 126), and outputs the selected prediction signal to subtractor 104 and adder 116.
[00192] [00192] As illustrated in Figure 1, in various implementations, prediction controller 128 can output prediction parameters, which are input to entropy encoder 110. Entropy encoder 110 can generate an encoded bit stream (or sequence) based on the prediction parameters, input from prediction controller 128, and the quantized coefficients, input from quantizer 108.
[00193] [00193] Figure 9B illustrates an example of a process for deriving a motion vector of a current image in merge mode.
[00194] [00194] First, a prediction MV list is generated, in which prediction MV candidates are registered. Examples of prediction MV candidates include: spatially neighboring prediction MVs, which are MVs of encoded blocks positioned in the spatial neighborhood of the target block; temporally neighboring prediction MVs, which are MVs of blocks, in encoded reference images, that neighbor a block at the same location as the target block; combined prediction MVs, which are MVs generated by combining the MV values of a spatially neighboring prediction MV and a temporally neighboring prediction MV; and a zero prediction MV, which is a MV whose value is zero.
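The following is a minimal, non-limiting sketch of such a prediction MV list; the priority order, the duplicate pruning, the averaging used for the combined candidate, and the list length of five are all illustrative assumptions.

```python
def build_merge_list(spatial_mvs, temporal_mvs, max_len=5):
    """Minimal sketch of a merge-mode prediction MV list, assuming a
    simple priority order: spatial, temporal, combined, then zero.
    The pruning of duplicates and the max_len value are assumptions."""
    cand = []
    for mv in spatial_mvs + temporal_mvs:
        if mv not in cand:
            cand.append(mv)
    # Combined prediction MV: pair a spatial with a temporal candidate
    # (averaging the two is an illustrative way of combining them).
    if spatial_mvs and temporal_mvs:
        s, t = spatial_mvs[0], temporal_mvs[0]
        combined = ((s[0] + t[0]) // 2, (s[1] + t[1]) // 2)
        if combined not in cand:
            cand.append(combined)
    while len(cand) < max_len:  # Pad with zero prediction MVs.
        cand.append((0, 0))
    return cand[:max_len]
```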
[00195] [00195] Next, the MV of the target block is determined by selecting one prediction MV from among the plurality of prediction MVs registered in the prediction MV list.
[00196] [00196] Furthermore, in the variable length encoder, a merge_idx, which is a signal that indicates which prediction MV was selected, is written to the stream.
[00197] [00197] Note that the prediction MVs registered in the prediction MV list illustrated in Figure 9B are one example. The number of prediction MVs registered in the prediction MV list can be different from the number shown in Figure 9B; the prediction MVs registered in the prediction MV list can omit one or more of the types of prediction MVs given in the example in Figure 9B; and the prediction MVs registered in the prediction MV list can include one or more types of prediction MVs in addition to and different from the types given in the example in Figure 9B.
[00198] [00198] The final MV can be determined by performing DMVR (dynamic motion vector refreshing) processing (to be described later) using the MV of the target block derived in merge mode.
[00199] [00199] Figure 9C is a conceptual diagram that illustrates an example of DMVR processing for determining an MV.
[00200] [00200] First, the most appropriate MV set for the current block (for example, in merge mode) is considered to be the candidate MV. Then, according to the candidate MV (L0), a reference pixel is identified in a first reference image (L0), which is an encoded image in the L0 direction. Similarly, according to the candidate MV (L1), a reference pixel is identified in a second reference image (L1), which is an encoded image in the L1 direction. These reference pixels are averaged to form a template.
[00201] [00201] Next, using the template, the regions surrounding the candidate MVs in the first and second reference images (L0 and L1) are searched, and the MV with the lowest cost is determined to be the final MV. The cost value can be calculated, for example, using the difference between each pixel value in the template and each pixel value in the searched regions, using the candidate MVs, etc.
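The following sketch illustrates DMVR-style refinement under simplifying assumptions: an integer-pel search window, SAD cost, and refinement of only the L0 candidate MV.

```python
import numpy as np

def dmvr_refine(ref0, ref1, pos, mv0, mv1, size, radius=1):
    """Minimal DMVR sketch: average the L0/L1 predictions at the
    candidate MVs to form a template, then search the surrounding
    integer offsets for the lowest-cost refined MV."""
    def block(ref, p, mv):
        y, x = p[0] + mv[1], p[1] + mv[0]
        return ref[y:y + size[0], x:x + size[1]].astype(np.int64)

    template = (block(ref0, pos, mv0) + block(ref1, pos, mv1)) // 2
    best, best_cost = mv0, None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            cand = (mv0[0] + dx, mv0[1] + dy)
            cost = np.abs(block(ref0, pos, cand) - template).sum()
            if best_cost is None or cost < best_cost:
                best, best_cost = cand, cost
    return best
```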
[00202] [00202] Note that the configuration and operation of the processes described here are fundamentally the same on both the encoder side and the decoder side, to be described below.
[00203] [00203] Any processing other than the processing described above can be used, as long as the processing is able to derive the final MV by searching the surroundings of the candidate MV.
[00204] [00204] The following is a description of an example of a mode that generates a prediction image (a prediction) using LIC (local illumination compensation) processing.
[00205] [00205] Figure 9D illustrates an example of a prediction image generation method using a luminance correction process performed through LIC processing.
[00206] [00206] First, from an encoded reference image, an MV is derived to obtain a reference image that corresponds to the current block.
[00207] [00207] Next, for the current block, information indicating how the luminance value changed between the reference image and the current image is obtained, based on the luminance pixel values of the encoded left neighboring reference region and the encoded upper neighboring reference region in the current image, and based on the luminance pixel values at the same locations in the reference image specified by the MV. The information indicating how the luminance value changed is used to calculate a luminance correction parameter.
[00208] [00208] The prediction image for the current block is generated by performing a luminance correction process, which applies the luminance correction parameter to the reference image, in the reference image specified by the MV.
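As an illustrative sketch of the luminance correction parameter, the following fits a scale and offset by least squares over the neighboring reference regions; the least-squares fit and the 8-bit clipping range are assumptions for illustration.

```python
import numpy as np

def lic_predict(ref_block, nb_cur, nb_ref):
    """Minimal LIC sketch: derive a luminance correction parameter
    (scale a and offset b) from the encoded left/upper neighboring
    samples of the current image (nb_cur) and the co-located samples
    in the reference image (nb_ref), then apply it to the reference
    block."""
    a, b = np.polyfit(nb_ref.astype(np.float64).ravel(),
                      nb_cur.astype(np.float64).ravel(), 1)
    return np.clip(a * ref_block + b, 0, 255).astype(ref_block.dtype)
```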
[00209] [00209] Note that the shape of the surrounding reference region(s) illustrated in Figure 9D is just an example; a different shape can be used.
[00210] [00210] Furthermore, although a prediction image is generated from a single reference image in this example, in cases where a prediction image is generated from a plurality of reference images, the prediction image can be generated after performing a luminance correction process, as described above, on each of the reference images obtained from the reference images.
[00211] [00211] An example of a method for determining whether to implement LIC processing is to use a lic_flag, which is a signal that indicates whether to implement LIC processing. As a specific example, the encoder determines whether the current block belongs to a region of luminance change. The encoder sets the lic_flag to a value of "1" when the block belongs to a region of luminance change, and implements LIC processing when encoding. The encoder sets the lic_flag to a value of "0" when the block does not belong to a region of luminance change, and encodes the block without implementing LIC processing. The decoder can switch between implementing LIC processing or not by decoding the lic_flag written in the stream and performing the decoding according to the flag value.
[00212] [00212] An example of a different method of determining whether to implement LIC processing is discerning whether LIC processing was determined to be implemented for a surrounding block. As a specific example, when merge mode is used for the current block, it is determined whether LIC processing was applied in the encoding of the surrounding encoded block selected when deriving the MV in merge mode. The result is then used to further determine whether or not to implement LIC processing for the current block. Note that in this example as well, the same applies to the processing performed on the decoder side.
[00213] [00213] Next, a decoder capable of decoding an encoded signal (encoded bit stream) emitted from encoder 100 will be described. Figure 10 is a block diagram illustrating a functional configuration of the decoder 200 according to a modality. Decoder 200 is a moving image decoder that decodes a moving image block by block.
[00214] [00214] As illustrated in Figure 10, decoder 200 includes an entropy decoder 202, inverse quantizer 204, inverse transformer 206, adder 208, block memory 210, loop filter 212, frame memory 214, intrapredictor 216, interpredictor 218, and prediction controller 220.
[00215] [00215] Decoder 200 is realized as, for example, a generic processor and memory. In this case, when a software program stored in the memory is executed by the processor, the processor functions as the entropy decoder 202, inverse quantizer 204, inverse transformer 206, adder 208, loop filter 212, intrapredictor 216, interpredictor 218, and prediction controller 220. Alternatively, decoder 200 can be realized as one or more dedicated electronic circuits corresponding to the entropy decoder 202, inverse quantizer 204, inverse transformer 206, adder 208, loop filter 212, intrapredictor 216, interpredictor 218, and prediction controller 220.
[00216] [00216] From now on, each component included in decoder 200 will be described. (Entropy Decoder)
[00217] [00217] Entropy decoder 202 entropy decodes an encoded bit stream. More specifically, for example, entropy decoder 202 arithmetic decodes the encoded bit stream into a binary signal. The entropy decoder 202 then debinarizes the binary signal. Entropy decoder 202 outputs the quantized coefficients of each block to inverse quantizer 204. Entropy decoder 202 can also output the prediction parameters, which can be included in the encoded bit stream (see Figure 1), to intrapredictor 216, interpredictor 218, and prediction controller 220, so that they can perform the same prediction processing as performed on the encoder side by intrapredictor 124, interpredictor 126, and prediction controller 128. (Inverse Quantizer)
[00218] [00218] Inverse quantizer 204 inverse quantizes the quantized coefficients of a block to be decoded (hereinafter referred to as the current block), which are input from entropy decoder 202. More specifically, inverse quantizer 204 inverse quantizes the quantized coefficients of the current block based on quantization parameters that correspond to the quantized coefficients. Inverse quantizer 204 then outputs the inverse quantized coefficients (that is, the transform coefficients) of the current block to inverse transformer 206. (Inverse Transformer)
[00219] [00219] Inverse transformer 206 restores the prediction errors (residuals) by inverse transforming the transform coefficients, which are input from inverse quantizer 204.
[00220] [00220] For example, when information analyzed from an encoded bit stream indicates application of EMT or AMT (for example, when the AMT flag is set to true), inverse transformer 206 inverse transforms the transform coefficients of the current block based on the information indicating the analyzed transform type.
[00221] [00221] Furthermore, for example, when the information analyzed from an encoded bit stream indicates application of NSST, inverse transformer 206 applies a secondary inverse transform to the transform coefficients.
[00222] [00222] Adder 208 reconstructs the current block by adding the prediction errors, which are input from inverse transformer 206, and the prediction samples, which are input from prediction controller 220. Adder 208 then outputs the reconstructed block to block memory 210 and loop filter 212.
[00223] [00223] Block memory 210 is a storage for storing blocks in an image to be decoded (hereinafter referred to as a current image) for reference in intraprediction.
[00224] [00224] Loop filter 212 applies a loop filter on blocks rebuilt by adder 208, and outputs the reconstructed blocks filtered to frame memory 214 and, for example, to a display device.
[00225] [00225] When information indicating enabling or disabling of ALF, analyzed from an encoded bit stream, indicates enabled, one filter from among a plurality of filters is selected based on the direction and activity of the local gradients, and the selected filter is applied to the reconstructed block.
[00226] [00226] Frame memory 214 is a storage for storing reference images used in interprediction, and is also referred to as a frame buffer. More specifically, frame memory 214 stores reconstructed blocks filtered by loop filter 212.
[00227] [00227] Intrapredictor 216 generates a prediction signal (intraprediction signal) by intrapredicting with reference to a block or blocks in the current image stored in block memory 210. More specifically, intrapredictor 216 generates an intraprediction signal with reference to samples (for example, luma and/or chroma values) of a block or blocks neighboring the current block, and then outputs the intraprediction signal to prediction controller 220.
[00228] [00228] Note that when an intraprediction mode in which a chroma block is intrapredicted from a luma block is selected, intrapredictor 216 can predict the chroma component of the current block based on the luma component of the current block.
[00229] [00229] Furthermore, when information indicating application of PDPC is analyzed from an encoded bit stream (the prediction parameters output from entropy decoder 202, for example), intrapredictor 216 corrects the intrapredicted pixel values based on horizontal/vertical reference pixel gradients. (Interpredictor)
[00230] [00230] Interpredictor 218 predicts the current block with reference to a reference image stored in frame memory 214. The interprediction is performed per current block or per sub-block (for example, per 4x4 block) in the current block. For example, interpredictor 218 generates an interprediction signal for the current block or sub-block based on motion compensation, using motion information (for example, a motion vector) analyzed from an encoded bit stream (the prediction parameters output from entropy decoder 202, for example), and outputs the interprediction signal to prediction controller 220.
[00231] [00231] When the information analyzed from the encoded bit stream indicates application of OBMC mode, interpredictor 218 generates the interprediction signal using motion information of a neighboring block in addition to the motion information of the current block obtained from motion estimation.
[00232] [00232] Furthermore, when the information analyzed from the encoded bit stream indicates application of FRUC mode, interpredictor 218 derives motion information by performing motion estimation according to the pattern matching method (bilateral matching or template matching) analyzed from the encoded bit stream. Interpredictor 218 then performs motion compensation (prediction) using the derived motion information.
[00233] [00233] Furthermore, when BIO mode is to be applied, interpredictor 218 derives a motion vector based on a model that assumes uniform linear motion. Furthermore, when the information analyzed from the encoded bit stream indicates that the affine motion compensation prediction mode is to be applied, interpredictor 218 derives a motion vector for each sub-block based on motion vectors of neighboring blocks. (Prediction Controller)
[00234] [00234] Prediction controller 220 selects either the intraprediction signal or the interprediction signal, and outputs the selected prediction signal to adder 208. In general, the configuration, functions, and operations of prediction controller 220, interpredictor 218, and intrapredictor 216 on the decoder side can correspond to the configuration, functions, and operations of prediction controller 128, interpredictor 126, and intrapredictor 124 on the encoder side. (Non-rectangular partitioning)
[00235] [00235] In prediction controller 128 coupled to intrapredictor 124 and interpredictor 126 on the encoder side (see Figure 1), as well as in prediction controller 220 coupled to intrapredictor 216 and interpredictor 218 on the decoder side (see Figure 10), the processes of dividing an image block into partitions that include at least a first partition that has a non-rectangular shape (for example, a triangle) and a second partition, and of predicting and encoding (or decoding) the partitions, can be performed as described below.
[00236] [00236] Figure 11 is a flowchart illustrating an example of a process of dividing an image block into partitions that include at least a first partition that has a non-rectangular shape (for example, a triangle) and a second partition, and performing additional processing that includes encoding (or decoding) the image block as a reconstructed combination of the first and second partitions.
[00237] [00237] In step S1001, an image block is divided into partitions that include a first partition that has a non-rectangular shape and a second partition, which may or may not have a non-rectangular shape. For example, as shown in Figure 12, an image block can be divided from an upper left corner of the image block to a lower right corner of the image block to create two triangular partitions, or it can be divided from an upper right corner of the image block to a lower left corner of the image block to create two triangular partitions.
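As an illustrative sketch of this division, the following builds pixel masks for the two triangular partitions; the mask representation and the tie-break on the diagonal are illustrative assumptions.

```python
import numpy as np

def triangle_masks(h, w, direction):
    """Minimal sketch: build 0/1 masks for the two triangular
    partitions of an h x w block. direction 'tl_br' splits from the
    top-left to the bottom-right corner; 'tr_bl' from the top-right
    to the bottom-left corner."""
    y, x = np.mgrid[0:h, 0:w]
    if direction == 'tl_br':
        first = (x * h >= y * w).astype(np.uint8)   # upper-right triangle
    else:  # 'tr_bl'
        first = ((w - 1 - x) * h >= y * w).astype(np.uint8)
    second = (1 - first).astype(np.uint8)
    return first, second
```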
[00238] [00238] In step S1002, the process predicts a first motion vector for the first partition and predicts a second motion vector for the second partition. For example, the prediction of the first and second motion vectors may include selecting the first motion vector from a first set of motion vector candidates and selecting the second motion vector from a second set of motion vector candidates. movement.
[00239] [00239] In step S1003, a motion compensation process is performed to obtain the first partition using the first motion vector, which is derived in step S1002 above, and to obtain the second partition using the second motion vector , which is derived in step S1002 above.
[00240] [00240] In step S1004, a prediction process is performed for the image block as a (reconstructed) combination of the first partition and the second partition. The prediction process can include a boundary smoothing process to smooth the boundary between the first partition and the second partition. For example, the boundary smoothing process can involve weighting first boundary pixel values predicted based on the first partition and second boundary pixel values predicted based on the second partition. Several implementations of the boundary smoothing process will be described below with reference to Figures 13, 14, 20, and 21A-21D.
[00241] [00241] In step S1005, the process encodes or decodes the image block using one or more parameters that include a partition parameter indicative of the division of the image block in the first partition that has a non-rectangular shape and in the second partition. As summarized in the table in Figure 15, for example, the partition parameter ("the first index value") can together encode, for example, a division direction applied to the division (for example, from top left to bottom right or from upper right to lower left as shown in Figure 12) and the first and second motion vectors derived in step S1002 above. Details of such a partition syntax operation involving the one or more parameters that include the partition parameter will be described in detail below with reference to Figures 15, 16 and 22-25.
[00242] [00242] Figure 17 is a flowchart illustrating a process 2000 of dividing an image block. In step S2001, the process divides an image block into a plurality of partitions that includes a first partition that has a non-rectangular shape and a second partition, which may or may not have a non-rectangular shape. As shown in Figure 12, an image block can be divided into a first partition that has a triangular shape and a second partition that also has a triangular shape. There are numerous other examples in which an image block is divided into a plurality of partitions that includes a first partition and a second partition, of which at least the first partition has a non-rectangular shape. The non-rectangular shape can be a triangle, a trapezoid, or a polygon with at least five sides and angles.
[00243] [00243] For example, as shown in Figure 18, an image block can be divided into two triangular shaped partitions; an image block can be divided into more than two triangular partitions (for example, three triangular partitions); an image block can be divided into a combination of triangular-shaped partition (s) and rectangular-shaped partition (s); or an image block can be divided into a combination of triangle-shaped partition (s) and polygon-shaped partition (s).
[00244] [00244] As additionally shown in Figure 19, an image block can be divided into an L-shaped (polygon-shaped) partition and a rectangular-shaped partition; an image block can be divided into a pentagon-shaped (polygon) partition and a triangle-shaped partition; an image block can be divided into a hexagon-shaped (polygon) partition and a pentagon-shaped (polygon) partition; or an image block can be divided into multiple polygon-shaped partitions.
[00245] [00245] Referring back to Figure 17, in step S2002, the process predicts a first motion vector for the first partition, for example by selecting the first motion vector from a first set of motion vector candidates, and predicts a second motion vector for the second partition, for example by selecting the second motion vector from a second set of motion vector candidates. For example, the first set of motion vector candidates can include motion vectors of partitions neighboring the first partition, and the second set of motion vector candidates can include motion vectors of partitions neighboring the second partition. The neighboring partitions can be one or both of spatially neighboring partitions and temporally neighboring partitions. Some examples of spatially neighboring partitions include a partition located at the left, lower left, lower, lower right, right, upper right, upper, or upper left of the partition being processed. Examples of temporally neighboring partitions are co-located partitions in reference images of the image block.
[00246] [00246] In several implementations, the partitions neighboring the first partition and the partitions neighboring the second partition can be outside the image block from which the first partition and the second partition are divided. The first set of motion vector candidates can be the same as, or different from, the second set of motion vector candidates. In addition, at least one of the first set of motion vector candidates and the second set of motion vector candidates can be the same as another, third set of motion vector candidates prepared for the image block.
[00247] [00247] In some implementations, in step S2002, in response to determining that the second partition, like the first partition, also has a non-rectangular shape (for example, a triangle), process 2000 creates the second set of motion vector candidates (for the second, non-rectangular partition) that includes motion vectors of partitions neighboring the second partition, excluding the first partition (that is, excluding the motion vector of the first partition). On the other hand, in response to determining that the second partition, unlike the first partition, has a rectangular shape, process 2000 creates the second set of motion vector candidates (for the second, rectangular partition) that includes motion vectors of partitions neighboring the second partition, including the first partition.
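The inclusion/exclusion rule above can be sketched as follows; the function and parameter names are hypothetical.

```python
def build_candidate_set(neighbor_mvs, first_partition_mv,
                        second_is_rectangular):
    """Minimal sketch of the rule above: the second partition's
    candidate set includes the first partition's MV only when the
    second partition is rectangular. neighbor_mvs holds the MVs of
    spatially/temporally neighboring partitions."""
    candidates = list(neighbor_mvs)
    if second_is_rectangular:
        candidates.append(first_partition_mv)  # inclusive of the first partition
    return candidates
```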
[00248] [00248] In step S2003, the process encodes or decodes the first partition using the first motion vector derived in step S2002 above, and encodes or decodes the second partition using the second motion vector derived in step S2002 above.
[00249] [00249] An image block division process, such as process 2000 in Figure 17, can be performed by an image encoder, as shown in Figure 1 for example, which includes a circuit and a memory coupled to the circuit; the circuit, in operation, performs the dividing, predicting, and encoding steps S2001 to S2003 described above.
[00250] [00250] According to another modality, as shown in Figure 1, an image encoder is provided that includes: a divider 102 which, in operation, receives and divides an original image into blocks; an adder 104 which, in operation, receives the blocks from the divider and predictions from a prediction controller 128, and subtracts each prediction from its corresponding block to output a residual; a transformer 106 which, in operation, performs a transform on the residuals output from adder 104 to output transform coefficients; a quantizer 108 which, in operation, quantizes the transform coefficients to generate quantized transform coefficients; an entropy encoder 110 which, in operation, encodes the quantized transform coefficients to generate a bit stream; and the prediction controller 128 coupled to an interpredictor 126, an intrapredictor 124, and a memory 118, 122, in which interpredictor 126, in operation, generates a prediction of a current block based on a reference block in an encoded reference image, and intrapredictor 124, in operation, generates a prediction of a current block based on an encoded reference block in a current image. Prediction controller 128, in operation, divides the blocks into a plurality of partitions that includes a first partition that has a non-rectangular shape and a second partition (Figure 17, step S2001); predicts a first motion vector for the first partition and a second motion vector for the second partition (step S2002); and encodes the first partition using the first motion vector and the second partition using the second motion vector (step S2003).
[00251] [00251] According to another embodiment, an image decoder, as shown in Figure 10 for example, is provided which includes a circuit and a memory coupled to the circuit. The circuit, in operation, performs: divide an image block into a plurality of partitions that includes a first partition that has a non-rectangular shape and a second partition (Figure 17, step S2001); predict a first motion vector for the first partition and a second motion vector for the second partition (step S2002); and decode the first partition using the first motion vector and the second partition using the second motion vector (step S2003).
[00252] [00252] According to an additional modality, an image decoder as shown in Figure 10 is provided that includes: an entropy decoder 202 which, in operation, receives and decodes an encoded bit stream to obtain quantized transform coefficients; an inverse quantizer 204 and inverse transformer 206 which, in operation, inverse quantize the quantized transform coefficients to obtain transform coefficients and inverse transform the transform coefficients to obtain residuals; an adder 208 which, in operation, adds the residuals output from inverse quantizer 204 and inverse transformer 206 and predictions output from a prediction controller 220 to reconstruct blocks; and the prediction controller 220 coupled to an interpredictor 218, an intrapredictor 216, and a memory 210, 214, in which interpredictor 218, in operation, generates a prediction of a current block based on a reference block in a decoded reference image, and intrapredictor 216, in operation, generates a prediction of a current block based on a decoded reference block in a current image. Prediction controller 220, in operation, divides an image block into a plurality of partitions that includes a first partition that has a non-rectangular shape and a second partition; predicts a first motion vector for the first partition and a second motion vector for the second partition; and decodes the first partition using the first motion vector and the second partition using the second motion vector.
[00253] [00253] As described with respect to Figure 11, step S1004, according to various modalities, performing a prediction process for the image block as a (reconstructed) combination of the first partition that has a non-rectangular shape and the second partition can involve applying a boundary smoothing process along the boundary between the first partition and the second partition.
[00254] [00254] For example, Figure 21B illustrates an example of a boundary smoothing process that involves weighting first boundary pixel values, which are first predicted based on the first partition, and second boundary pixel values, which are second predicted based on the second partition.
[00255] [00255] Figure 20 is a flowchart illustrating an overall boundary smoothing process 3000 that involves weighting first boundary pixel values predicted based on the first partition and second boundary pixel values predicted based on the second partition, according to one modality. In step S3001, an image block is divided into a first partition and a second partition along a boundary, where at least the first partition has a non-rectangular shape, as shown in Figure 21A or in Figure 21B, for example.
[00256] [00256] In step S3002, first values (for example, color, luminance, transparency, etc.) of a set of pixels ("boundary pixels" in Figure 21A) of the first partition along the boundary are first predicted, where the first values are predicted using information of the first partition. In step S3003, second values of the (same) set of pixels of the first partition along the boundary are second predicted, where the second values are predicted using information of the second partition. In some implementations, at least one of the first prediction and the second prediction is an interprediction process that predicts the first values and the second values based on a reference partition in an encoded reference image. Referring to Figure 21D, in some implementations, the prediction process predicts first values for all pixels of the first partition ("the first set of samples"), including the set of pixels over which the first partition and the second partition overlap, and predicts second values for only the set of pixels ("the second set of samples") over which the first and second partitions overlap. In other implementations, at least one of the first prediction and the second prediction is an intraprediction process that predicts the first values and the second values based on an encoded reference partition in a current image. In some implementations, the prediction method used in the first prediction is different from the prediction method used in the second prediction. For example, the first prediction can include an interprediction process and the second prediction can include an intraprediction process. The information used in the first prediction of the first values or in the second prediction of the second values can be motion vectors, intraprediction directions, etc., of the first or second partition.
[00257] [00257] In step S3004, the first values, predicted using information of the first partition, and the second values, predicted using information of the second partition, are weighted. In step S3005, the first partition is encoded or decoded using the weighted first and second values.
[00258] [00258] Figure 21B illustrates an example of a boundary smoothing operation in which the first partition and the second partition overlap by five pixels (at most) of each row or each column. That is, the number of pixels of the set in each row or each column, for which the first values are predicted based on the first partition and the second values are predicted based on the second partition, is five at most. Figure 21C illustrates another example of a boundary smoothing operation in which the first partition and the second partition overlap by three pixels (at most) of each row or each column. That is, the number of pixels of the set in each row or each column, for which the first values are predicted based on the first partition and the second values are predicted based on the second partition, is three at most.
[00259] [00259] Figure 13 illustrates another example of a boundary smoothing operation in which the first partition and the second partition overlap by four pixels (at most) of each row or each column. That is, the number of pixels of the set in each row or each column, for which the first values are predicted based on the first partition and the second values are predicted based on the second partition, is four at most. In the illustrated example, weights of 1/8, 1/4, 3/4, and 7/8 can be applied to the first values of the four pixels of the set, respectively, and weights of 7/8, 3/4, 1/4, and 1/8 can be applied to the second values of the four pixels of the set, respectively.
[00260] [00260] Figure 14 illustrates additional examples of boundary smoothing operations in which the first partition and the second partition overlap by zero pixels of each row or each column (that is, they do not overlap), overlap by one pixel (at most) of each row or each column, and overlap by two pixels (at most) of each row or each column, respectively. In the example in which the first and second partitions do not overlap, no weights are applied. In the example in which the first and second partitions overlap by one pixel of each row or each column, a weight of 1/2 can be applied to the first values of the pixels of the set predicted based on the first partition, and a weight of 1/2 can be applied to the second values of the pixels of the set predicted based on the second partition. In the example in which the first and second partitions overlap by two pixels of each row or each column, weights of 1/3 and 2/3 can be applied to the first values of the two pixels of the set predicted based on the first partition, respectively, and weights of 2/3 and 1/3 can be applied to the second values of the two pixels of the set predicted based on the second partition, respectively.
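As a concrete illustration of these weightings, the following sketches step S3004 for the four-pixel overlap of Figure 13; applying the weights along each row of a vertical boundary is an illustrative assumption.

```python
import numpy as np

def blend_boundary(first_vals, second_vals,
                   w_first=(1/8, 1/4, 3/4, 7/8)):
    """Minimal sketch of step S3004 for a four-pixel overlap: weight
    the first predicted values and the second predicted values of the
    boundary pixel set, pixel by pixel along each row."""
    w1 = np.asarray(w_first)
    w2 = 1.0 - w1  # 7/8, 3/4, 1/4, 1/8, matching Figure 13
    return w1 * first_vals + w2 * second_vals
```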
[00261] [00261] According to the modalities described above, the number of pixels of the set over which the first partition and the second partition overlap is an integer. In other implementations, the number of overlapping pixels of the set can be non-integer, for example fractional. Also, the weights applied to the first and second values of the pixel set can be fractional or integer, depending on the application.
[00262] [00262] A boundary smoothing process, such as process 3000 in Figure 20, can be performed by an image encoder, as shown in Figure 1 for example, which includes a circuit and a memory coupled to the circuit. The circuit, in operation, performs a boundary smoothing operation along a boundary between a first partition that has a non-rectangular shape and a second partition, which are divided from an image block (Figure 20, step S3001). The boundary smoothing operation includes: first predicting first values of a set of pixels of the first partition along the boundary, using information of the first partition (step S3002); second predicting second values of the set of pixels of the first partition along the boundary, using information of the second partition (step S3003); weighting the first values and the second values (step S3004); and encoding the first partition using the weighted first values and second values (step S3005).
[00263] [00263] According to another modality, as shown in Figure 1, an image encoder is provided that includes: a divider 102 which, in operation, receives and divides an original image into blocks; an adder 104 which, in operation, receives the blocks from the divider and predictions from a prediction controller 128, and subtracts each prediction from its corresponding block to output a residual; a transformer 106 which, in operation, performs a transform on the residuals output from adder 104 to output transform coefficients; a quantizer 108 which, in operation, quantizes the transform coefficients to generate quantized transform coefficients; an entropy encoder 110 which, in operation, encodes the quantized transform coefficients to generate a bit stream; and the prediction controller 128 coupled to an interpredictor 126, an intrapredictor 124, and a memory 118, 122, in which interpredictor 126, in operation, generates a prediction of a current block based on a reference block in an encoded reference image, and intrapredictor 124, in operation, generates a prediction of a current block based on an encoded reference block in a current image. Prediction controller 128, in operation, performs a boundary smoothing operation along a boundary between a first partition that has a non-rectangular shape and a second partition, which are divided from an image block (Figure 20, step S3001). The boundary smoothing operation includes: first predicting first values of a set of pixels of the first partition along the boundary, using information of the first partition (step S3002); second predicting second values of the set of pixels of the first partition along the boundary, using information of the second partition (step S3003); weighting the first values and the second values (step S3004); and encoding the first partition using the weighted first values and second values (step S3005).
[00264] [00264] According to another modality, an image decoder is provided, as shown in Figure 10 for example, which includes a circuit and a memory coupled to the circuit. The circuit, in operation, performs a boundary smoothing operation along a boundary between a first partition that has a non-rectangular shape and a second partition, which are divided from an image block (Figure 20, step S3001). The boundary smoothing operation includes: first predicting first values of a set of pixels of the first partition along the boundary, using information of the first partition (step S3002); second predicting second values of the set of pixels of the first partition along the boundary, using information of the second partition (step S3003); weighting the first values and the second values (step S3004); and decoding the first partition using the weighted first values and second values (step S3005).
[00265] [00265] According to another modality, an image decoder as shown in Figure 10 is provided that includes: an entropy decoder 202 which, in operation, receives and decodes an encoded bit stream to obtain quantized transform coefficients; an inverse quantizer 204 and inverse transformer 206 which, in operation, inverse quantize the quantized transform coefficients to obtain transform coefficients and inverse transform the transform coefficients to obtain residuals; an adder 208 which, in operation, adds the residuals output from inverse quantizer 204 and inverse transformer 206 and predictions output from a prediction controller 220 to reconstruct blocks; and the prediction controller 220 coupled to an interpredictor 218, an intrapredictor 216, and a memory 210, 214, in which interpredictor 218, in operation, generates a prediction of a current block based on a reference block in a decoded reference image, and intrapredictor 216, in operation, generates a prediction of a current block based on a decoded reference block in a current image. Prediction controller 220, in operation, performs a boundary smoothing operation along a boundary between a first partition that has a non-rectangular shape and a second partition, which are divided from an image block (Figure 20, step S3001). The boundary smoothing operation includes: first predicting first values of a set of pixels of the first partition along the boundary, using information of the first partition (step S3002); second predicting second values of the set of pixels of the first partition along the boundary, using information of the second partition (step S3003); weighting the first values and the second values (step S3004); and decoding the first partition using the weighted first values and second values (step S3005).
[00266] [00266] As described with respect to Figure 11, step S1005, according to various modalities, the image block divided into a first partition that has a non-rectangular shape and a second partition can be encoded or decoded using one or more parameters that include a partition parameter indicative of the division, as part of a partition syntax operation.
[00267] [00267] Figure 15 is a table of sample partition parameters ("the first index values") and the sets of information jointly encoded by the partition parameters, respectively. The partition parameters ("the first index values") range from 0 to 6 and jointly encode: the direction of dividing an image block into a first partition and a second partition, both of which are triangles (see Figure 12), the first motion vector predicted for the first partition (Figure 11, step S1002), and the second motion vector predicted for the second partition (Figure 11, step S1002). Specifically, partition parameter 0 encodes that the division direction is from the upper left corner to the lower right corner, the first motion vector is the "2nd" motion vector listed in the first set of motion vector candidates for the first partition, and the second motion vector is the "1st" motion vector listed in the second set of motion vector candidates for the second partition.
[00268] [00268] Partition parameter 1 encodes that the division direction is from the upper right corner to the lower left corner, the first motion vector is the "1st" motion vector listed in the first set of motion vector candidates for the first partition, and the second motion vector is the "2nd" motion vector listed in the second set of motion vector candidates for the second partition.
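The Figure 15 mapping can be sketched as a lookup table; only the two entries spelled out above are filled in, and the remaining entries are hypothetical placeholders.

```python
# Minimal sketch of the Figure 15 mapping: each first index value
# jointly encodes (division direction, first MV index, second MV index).
PARTITION_PARAMS = {
    0: ("top-left to bottom-right", 2, 1),
    1: ("top-right to bottom-left", 1, 2),
    # 2..6: further combinations per Figure 15 (not reproduced here)
}

def decode_partition_param(index):
    direction, first_mv_idx, second_mv_idx = PARTITION_PARAMS[index]
    return direction, first_mv_idx, second_mv_idx
```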
[00269] [00269] Figure 22 is a flowchart illustrating a method 4000 performed on the encoder side. In step S4001, the method divides an image block into a plurality of partitions that includes a first partition that has a non-rectangular shape and a second partition, based on a partition parameter indicative of the division. For example, as shown in Figure 15 above, the partition parameter can indicate the direction of dividing the image block (for example, from the upper right corner to the lower left corner, or from the upper left corner to the lower right corner). In step S4002, the method encodes the first partition and the second partition. In step S4003, the method writes one or more parameters that include the partition parameter to a bit stream, which the decoder side can receive and decode to obtain the one or more parameters and perform the same prediction process (as performed on the encoder side) for the first and second partitions on the decoder side. The one or more parameters that include the partition parameter can jointly or separately encode various pieces of information, such as the non-rectangular shape of the first partition, the shape of the second partition, the division direction used to divide the image block to obtain the first and second partitions, the first motion vector of the first partition, the second motion vector of the second partition, etc.
[00270] [00270] Figure 23 is a flow chart illustrating a 5000 method performed on the decoder side. In step S5001, the process analyzes one or more parameters of a bit stream, where the one or more parameters include a partition parameter indicative of dividing an image block into a plurality of partitions that includes a first partition that has a non-rectangular shape and a second partition. The one or more parameters that include the analyzed bitstream partition parameter can together or separately encode several pieces of information needed by the decoder side to perform the same prediction process as performed on the encoder side, such as such as the non-rectangular shape of the first partition, the shape of the second partition, the direction of division used to split an image block to obtain the first and second partitions, the first motion vector of the first partition, the second motion vector of the second partition, etc. In step S5002, process 5000 divides the image block into the plurality of partitions based on the analyzed bitstream partition parameter. In step S5003, the process decodes the first partition and the second partition, as divided from the image block.
[00271] [00271] Figure 24 is a table of sample partition parameters ("the first index values") and the sets of information jointly encoded by the partition parameters, respectively, similar in nature to the sample table described above with respect to Figure 15. In Figure 24, the partition parameters ("the first index values") range from 0 to 6 and jointly encode: the shape of the first and second partitions divided from an image block, the direction of dividing the image block into the first and second partitions, the first motion vector predicted for the first partition (Figure 11, step S1002), and the second motion vector predicted for the second partition (Figure 11, step S1002). Specifically, partition parameter 0 encodes that neither the first nor the second partition has a triangular shape, so the division direction information is "N/A", the first motion vector information is "N/A", and the second motion vector information is "N/A".
[00272] [00272] Partition parameter 1 encodes that the first and second partitions are triangles, that the direction of division is from the upper left corner to the lower right corner, that the first motion vector is the "2nd" motion vector listed in the first set of motion vector candidates for the first partition, and that the second motion vector is the "1st" motion vector listed in the second set of motion vector candidates for the second partition.
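One plausible realization of such a jointly coded table is a plain index lookup, as sketched below. Only the entries for parameters 0 and 1 are taken from the text; the remaining rows (indices 2 through 6) would be filled in analogously, and the exact values are not asserted here.

```python
# Illustrative lookup patterned after the Figure 24 table: each index value
# jointly encodes whether a triangular split applies, the split direction, and
# the positions of the first and second motion vectors in their candidate
# lists. Entries beyond index 1 are placeholders.

PARTITION_TABLE = {
    # index: (triangle?, split direction, 1st MV position, 2nd MV position)
    0: (False, None,    None, None),  # no triangular partition: all "N/A"
    1: (True,  "tl-br", 2,    1),     # partition parameter 1 as described
}

def lookup_partition_parameter(index):
    is_triangle, direction, mv1_pos, mv2_pos = PARTITION_TABLE[index]
    if not is_triangle:
        return None  # fall back to rectangular partitioning
    return direction, mv1_pos, mv2_pos

print(lookup_partition_parameter(1))  # ('tl-br', 2, 1)
```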
[00273] [00273] According to some implementations, the partition parameters (index values) can be binarized according to a binarization scheme, which is selected depending on a value of at least one of the one or more parameters. Figure 16 illustrates a sample binarization scheme for binarizing the index values (the partition parameter values).
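Figure 16 itself is not reproduced here, but truncated unary binarization is a common choice for such index values; the sketch below illustrates that scheme under the assumption that it matches the spirit of Figure 16, not as its exact content.

```python
# Truncated unary binarization: index value n becomes n ones followed by a
# terminating zero, with the terminator dropped for the largest value.
# Whether Figure 16 uses exactly this scheme is an assumption.

def truncated_unary(index, max_index):
    bins = "1" * index
    return bins if index == max_index else bins + "0"

for i in range(7):  # index values 0 through 6, as in Figure 24
    print(i, truncated_unary(i, 6))
# 0 -> "0", 1 -> "10", ..., 6 -> "111111"
```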
[00274] [00274] Figure 25 is a table of sample combinations of a first parameter and a second parameter, one of which is a partition parameter indicative of dividing an image block into a plurality of partitions that includes a first partition that has a non-rectangular shape and a second partition. In this example, the partition parameter can be used to indicate the division of an image block without jointly encoding other information, which is instead encoded by one or more of the other parameters.
[00275] [00275] In the first example in Figure 25, the first parameter is used to indicate an image block size, and the second parameter is used as the partition parameter (a flag) to indicate that at least one of the plurality of partitions divided from an image block has a triangular shape. Such a combination of the first and second parameters can be used to indicate, for example, 1) that when the image block size is larger than 64x64, there is no triangular partition, or 2) that when the ratio between the width and height of an image block is larger than 4 (for example, 64x4), there is no triangular partition.
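These two rules can be checked before any triangle-specific syntax is signaled, as in the sketch below; the reading of "larger than 64x64" as either side exceeding 64 samples is an assumption made for illustration.

```python
# Gate triangular partitioning on block size, following the two example rules
# of the first combination in Figure 25.

def triangle_partition_allowed(width, height):
    if width > 64 or height > 64:                    # rule 1: larger than 64x64
        return False
    if max(width, height) > 4 * min(width, height):  # rule 2: ratio above 4
        return False
    return True

print(triangle_partition_allowed(32, 32))  # True
print(triangle_partition_allowed(64, 4))   # False (ratio 16 > 4)
```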
[00276] [00276] In the second example of Figure 25, the first parameter is used to indicate a prediction mode, and the second parameter is used as the partition parameter (a flag) to indicate that at least one of the plurality of partitions divided from an image block has a triangular shape. Such a combination of the first and second parameters can be used to indicate, for example, 1) when an image block is encoded in intra mode, there is no triangular partition.
[00277] [00277] In the third example in Figure 25, the first parameter is used as the partition parameter (a flag) to indicate that at least one of the plurality of partitions divided from an image block has a triangular shape, and the second parameter is used to indicate a prediction mode. Such a combination of the first and second parameters can be used to indicate, for example, 1) when at least one of the plurality of partitions divided from an image block has a triangular shape, the image block must be inter coded.
[00278] [00278] In the fourth example of Figure 25, the first parameter indicates the motion vector of a neighboring block, and the second parameter is used as the partition parameter, which indicates the direction of division of an image block into two triangles. Such a combination of the first and second parameters can be used to indicate, for example, 1) when the motion vector of a neighboring block is in a diagonal direction, the direction of division of the image block into two triangles is from the upper left corner to the lower right corner.
[00279] [00279] In the fifth example in Figure 25, the first parameter indicates the intra prediction direction of a neighboring block, and the second parameter is used as the partition parameter, which indicates the direction of division of an image block into two triangles. Such a combination of the first and second parameters can be used to indicate, for example, 1) when the intra prediction direction of a neighboring block is an inverse diagonal direction, the direction of division of the image block into two triangles is from the upper right corner to the lower left corner.
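The fourth and fifth examples can be summarized as a single direction-selection rule, as sketched below. The sign convention for classifying a motion vector as diagonal, and the use of a 135-degree angle for the inverse diagonal intra direction, are assumptions made purely for illustration.

```python
# Direction selection following the fourth and fifth examples in Figure 25:
# a "diagonal" neighbor motion vector selects the top-left to bottom-right
# split, an "inverse diagonal" intra direction selects the opposite split.

SPLIT_TL_TO_BR = "upper-left to lower-right"
SPLIT_TR_TO_BL = "upper-right to lower-left"

def split_direction_from_neighbor(mv_x=None, mv_y=None, intra_angle=None):
    if mv_x is not None and mv_y is not None:
        # Fourth example: components of equal sign follow the main diagonal.
        return SPLIT_TL_TO_BR if mv_x * mv_y > 0 else SPLIT_TR_TO_BL
    if intra_angle is not None:
        # Fifth example: an inverse-diagonal intra direction (taken here as
        # 135 degrees) selects the anti-diagonal split.
        return SPLIT_TR_TO_BL if intra_angle == 135 else SPLIT_TL_TO_BR
    return None  # no usable neighbor information

print(split_direction_from_neighbor(mv_x=3, mv_y=5))    # diagonal MV
print(split_direction_from_neighbor(intra_angle=135))   # inverse diagonal
```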
[00280] [00280] It should be understood that the tables of one or more parameters that include the partition parameter, and of the information jointly or separately encoded by them, as shown in Figures 15, 24, and 25, are presented as examples only, and numerous other ways of encoding, jointly or separately, various pieces of information as part of the partition syntax operation described above are within the scope of the present description. For example, the partition parameter may indicate that the first partition is a triangle, a trapezoid, or a polygon with at least five sides and angles. The partition parameter can indicate that the second partition has a non-rectangular shape, such as a triangle, a trapezoid, or a polygon with at least five sides and angles. The partition parameter can indicate one or more pieces of information about the division, such as the non-rectangular shape of the first partition, the shape of the second partition (which can be non-rectangular or rectangular), and the direction of division applied to divide an image block into a plurality of partitions (for example, from an upper left corner of the image block to its lower right corner, or from an upper right corner of the image block to its lower left corner). The partition parameter can jointly encode additional information such as the first motion vector of the first partition, the second motion vector of the second partition, the image block size, the prediction mode, the motion vector of a neighboring block, the intra prediction direction of a neighboring block, etc. Alternatively, any of the additional information can be separately encoded by one or more parameters other than the partition parameter.
[00281] [00281] A partition syntax operation, like process 4000 in Figure 22, can be performed by an image encoder, as shown in Figure 1 for example, which includes a circuit and a memory coupled to the circuit. The circuit, in operation, performs a partition syntax operation that includes: dividing an image block into a plurality of partitions that includes a first partition that has a non-rectangular shape and a second partition, based on a partition parameter indicative of the division (Figure 22, step S4001); encoding the first partition and the second partition (S4002); and writing one or more parameters that include the partition parameter into a bit stream (S4003).
[00282] [00282] According to another modality, as shown in Figure 1, an image encoder is provided that includes: a divider 102 which, in operation, receives and divides an original image into blocks; an adder 104 which, in operation, receives the blocks from the divider 102 and predictions from a prediction controller 128, and subtracts each prediction from its corresponding block to output a residual; a transformer 106 which, in operation, performs a transform on the residuals output from the adder 104 to output transform coefficients; a quantizer 108 which, in operation, quantizes the transform coefficients to generate quantized transform coefficients; an entropy encoder 110 which, in operation, encodes the quantized transform coefficients to generate a bit stream; and the prediction controller 128 coupled to an inter predictor, an intra predictor, and a memory, in which the inter predictor, in operation, generates a prediction of a current block based on a reference block in an encoded reference image, and the intra predictor, in operation, generates a prediction of a current block based on an encoded reference block in a current image. The prediction controller 128, in operation, divides the blocks into a plurality of partitions that includes a first partition that has a non-rectangular shape and a second partition, predicts a first motion vector for the first partition and a second motion vector for the second partition, and encodes the first partition using the first motion vector and the second partition using the second motion vector.
[00283] [00283] According to another embodiment, an image decoder is provided, as shown in Figure 10 for example, which includes a circuit and a memory coupled to the circuit. The circuit, in operation, performs a partition syntax operation that includes: parsing one or more parameters from a bit stream, in which the one or more parameters include a partition parameter indicative of the division of an image block into a plurality of partitions that includes a first partition that has a non-rectangular shape and a second partition (Figure 23, step S5001); dividing the image block into the plurality of partitions based on the partition parameter (S5002); and decoding the first partition and the second partition (S5003).
[00284] [00284] According to an additional embodiment, an image decoder as shown in Figure 10 is provided that includes: an entropy decoder 202 which, in operation, receives and decodes an encoded bit stream to obtain quantized transform coefficients; an inverse quantizer and inverse transformer which, in operation, inverse quantizes the quantized transform coefficients to obtain transform coefficients and inverse transforms the transform coefficients to obtain residuals; an adder which, in operation, adds the residuals output from the inverse quantizer and inverse transformer and predictions output from a prediction controller to reconstruct blocks; and the prediction controller coupled to an inter predictor, an intra predictor, and a memory, in which the inter predictor, in operation, generates a prediction of a current block based on a reference block in a decoded reference image, and the intra predictor, in operation, generates a prediction of a current block based on a decoded reference block in a current image. The prediction controller, in operation, divides an image block into a plurality of partitions that includes a first partition that has a non-rectangular shape and a second partition, predicts a first motion vector for the first partition and a second motion vector for the second partition, and decodes the first partition using the first motion vector and the second partition using the second motion vector.
[00285] [00285] According to other examples, an inter predictor can perform the following process.
[00286] [00286] All motion vector candidates included in the first set of motion vector candidates can be uni-prediction motion vectors. That is, the inter predictor can determine only uni-prediction motion vectors as motion vector candidates in the first set of motion vector candidates.
[00287] [00287] The inter predictor can select only uni-prediction motion vector candidates from the first set of motion vector candidates.
[00288] [00288] Only uni-prediction motion vectors may be used to predict a small block, while bi-prediction motion vectors may be used to predict a large block. As an example, the prediction process may include judging an image block size. When the image block size is judged to be greater than a threshold, the prediction may include selecting the first motion vector from a first set of motion vector candidates, and the first set of motion vector candidates may contain uni-prediction and/or bi-prediction motion vectors. When the image block size is judged to be no greater than the threshold, the prediction may include selecting the first motion vector from a first set of motion vector candidates, and the first set of motion vector candidates may contain only uni-prediction motion vectors.
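This size-dependent restriction amounts to filtering the candidate list, as in the sketch below; the threshold value of 64 samples per side and the tuple representation of a candidate are illustrative assumptions, not values taken from the description.

```python
# Size-dependent restriction on the first candidate set: small blocks keep
# only uni-prediction candidates, large blocks may also use bi-prediction.

SIZE_THRESHOLD = 64  # illustrative threshold, in luma samples per side

def build_first_candidate_set(width, height, candidates):
    # candidates: list of (motion_vector, is_bi_prediction) tuples
    if width > SIZE_THRESHOLD or height > SIZE_THRESHOLD:
        return candidates  # large block: uni- and/or bi-prediction allowed
    return [c for c in candidates if not c[1]]  # small block: uni only

cands = [((1, 2), False), ((3, -1), True), ((0, 4), False)]
print(build_first_candidate_set(128, 64, cands))  # all three kept
print(build_first_candidate_set(32, 32, cands))   # bi-prediction dropped
```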
(Implementations and Applications)
[00289] [00289] As described in each of the above modalities, each functional or operational block can typically be realized as an MPU (microprocessing unit) and a memory, for example. Furthermore, the processes performed by each of the functional blocks can be realized by a program execution unit, such as a processor, which reads and executes software (a program) recorded on a recording medium such as a ROM. The software can be distributed. The software can be recorded on a variety of recording media such as semiconductor memory. Note that each functional block can also be realized as hardware (a dedicated circuit).
[00290] [00290] The processing described in each of the modalities can be performed through integrated processing using a single device (system) and, alternatively, it can be performed through decentralized processing using a plurality of devices. Furthermore, the processor that executes the program described above can be a single processor or a plurality of processors. In other words, integrated processing can be performed, and, alternatively, decentralized processing can be performed.
[00291] [00291] The modalities of this description are not limited to the exemplary modalities above; various modifications can be made to the exemplary modalities, the results of which are also included within the scope of the modalities of this description.
[00292] [00292] In the following, application examples of the moving image encoding method (image encoding method) and the moving image decoding method (image decoding method) described in each of the above modalities will be described, as well as various systems that implement those application examples. Such a system can be characterized as including an image encoder that employs the image encoding method, an image decoder that employs the image decoding method, or an image encoder-decoder that includes both the image encoder and the image decoder. Other configurations of such a system can be modified on a case-by-case basis.
(Usage Examples)
[00293] [00293] Figure 26 illustrates a general configuration of the content provisioning system ex100 suitable for implementing a content distribution service. The area in which the communication service is provided is divided into cells of desired sizes, and base stations ex106 through ex110, which are fixed wireless stations, are located in respective cells.
[00294] [00294] In the content provisioning system ex100, devices including a computer ex111, gaming device ex112, camera ex113, household appliance ex114, and smartphone ex115 are connected to the internet ex101 through the internet service provider ex102 or the communications network ex104 and base stations ex106 through ex110. The content provisioning system ex100 can combine and connect any combination of the above devices. In various implementations, the devices can be directly or indirectly connected together via a telephone network or near-field communication instead of through base stations ex106 through ex110. In addition, a streaming server ex103 can be connected to devices that include the computer ex111, gaming device ex112, camera ex113, household appliance ex114, and smartphone ex115 via, for example, the internet ex101. The streaming server ex103 can also be connected to, for example, a terminal in a hotspot on an airplane ex117 via a satellite ex116.
[00295] [00295] Note that instead of base stations ex106 through ex110, wireless access points or hotspots can be used. The streaming server ex103 can be connected to the communications network ex104 directly instead of via the internet ex101 or the internet service provider ex102, and can be connected to the airplane ex117 directly instead of via the satellite ex116.
[00296] [00296] The camera ex113 is a device capable of capturing still images and video, such as a digital camera. The smartphone ex115 is a smartphone, cell phone, or personal handyphone system (PHS) phone that can operate under the mobile communications system standards of the typical 2G, 3G, 3.9G, and 4G systems, as well as the next-generation 5G system.
[00297] [00297] The household appliance ex118 is, for example, a refrigerator or a device included in a home fuel cell cogeneration system.
[00298] [00298] In the content provisioning system ex100, a terminal that includes an image and/or video capture function is capable of, for example, live streaming by connecting to the streaming server ex103 through, for example, the base station ex106. When live streaming, a terminal (for example, the computer ex111, gaming device ex112, camera ex113, household appliance ex114, smartphone ex115, or airplane ex117) can perform the encoding processing described in the above modalities on still image or video content captured by a user through the terminal, can multiplex video data obtained through the encoding with audio data obtained by encoding audio corresponding to the video, and can transmit the obtained data to the streaming server ex103. In other words, the terminal functions as the image encoder according to one aspect of the present description.
[00299] [00299] The streaming server ex103 streams transmitted content data to clients that request the stream. Examples of clients include the computer ex111, gaming device ex112, camera ex113, household appliance ex114, smartphone ex115, and terminals inside the airplane ex117, which are capable of decoding the encoded data described above. The devices that receive the streamed data decode and reproduce the received data. In other words, the devices can each function as the image decoder according to one aspect of the present description.
(Decentralized Processing)
[00300] [00300] The streaming server ex103 can be realized as a plurality of servers or computers among which tasks such as processing, recording, and streaming of data are divided. For example, the streaming server ex103 can be realized as a content delivery network (CDN) that streams content through a network connecting multiple edge servers located across the world. In a CDN, an edge server physically close to the client is dynamically assigned to the client. Content is cached and streamed to the edge server to reduce load times. In the case of, for example, some kind of error or a change in connectivity due to, for example, a spike in traffic, it is possible to stream data stably at high speeds, since it is possible to avoid the affected parts of the network by, for example, dividing the processing between a plurality of edge servers, or switching the streaming tasks to a different edge server and continuing the stream.
[00301] [00301] Decentralization is not limited to the division of processing for streaming; the encoding of the captured data can be divided between and performed by the terminals, on the server side, or both. In one example, in typical encoding, processing is performed in two loops. The first loop is for detecting how complicated the image is on a frame-by-frame or scene-by-scene basis, or for detecting the encoding load. The second loop is for processing that maintains the image quality and improves the encoding efficiency. For example, it is possible to reduce the processing load on the terminals and improve the quality and efficiency of the content encoding by having the terminals execute the first encoding loop and having the server side that received the content execute the second encoding loop. In such a case, upon receipt of a decoding request, the encoded data resulting from the first loop executed by a terminal can be received and reproduced on another terminal in approximately real time.
[00302] [00302] In another example, the camera ex113 or the like extracts a feature quantity from an image, compresses data related to the feature quantity as metadata, and transmits the compressed metadata to a server. For example, the server determines the significance of an object based on the feature quantity and changes the quantization accuracy accordingly to perform compression appropriate to the meaning (or content significance) of the image. Feature quantity data is particularly effective in improving the precision and efficiency of motion vector prediction during the second compression pass performed by the server. Furthermore, encoding that has a relatively low processing load, such as variable length coding (VLC), can be handled by the terminal, and encoding that has a relatively high processing load, such as context-adaptive binary arithmetic coding (CABAC), can be handled by the server.
[00303] [00303] In yet another example, there are cases in which a plurality of videos of approximately the same scene is captured by a plurality of terminals in, for example, a stadium, shopping center, or factory. In such a case, for example, the encoding can be decentralized by dividing processing tasks between the plurality of terminals that captured the videos and, if necessary, other terminals that did not capture the videos and the server, on a per-unit basis. The units can be, for example, groups of images (GOP), images, or blocks that result from the division of an image. This makes it possible to reduce load times and achieve streaming that is closer to real time.
[00304] [00304] Since the videos are of approximately the same scene, management and/or instructions may be carried out by the server so that the videos captured by the different terminals can be mutually referenced.
[00305] [00305] Furthermore, the server can stream video data after performing transcoding to convert the encoding format of the video data. For example, the server can convert the encoding format from MPEG to VP (for example, VP9), and can convert H.264 to H.265.
[00306] [00306] In this way, encoding can be performed by a terminal or by one or more servers. Consequently, although the device that performs the encoding is referred to as a "server" or "terminal" in the following description, some or all of the processes executed by the server can be executed by the terminal, and likewise some or all of the processes executed by the terminal can be executed by the server. This also applies to decoding processes.
(3D, Multiple Angles)
[00307] [00307] There has been an increase in the use of images or videos combined from images or videos of different scenes captured concurrently, or of the same scene captured from different angles, by a plurality of terminals such as the camera ex113 and/or the smartphone ex115. The videos captured by the terminals are combined based on, for example, the separately obtained relative positional relationship between the terminals, or regions in the videos that have matching feature points.
[00308] [00308] In addition to encoding two-dimensional moving images, the server can encode a still image based on scene analysis of a moving image, either automatically or at a point in time specified by the user, and transmit the encoded still image to a receiving terminal. Furthermore, when the server can obtain the relative positional relationship between the video capture terminals, in addition to two-dimensional moving images, the server can generate a three-dimensional geometry of a scene based on videos of the same scene captured from different angles. The server can separately encode three-dimensional data generated from, for example, a point cloud and, based on a result of recognizing or tracking a person or object using the three-dimensional data, can select or reconstruct and generate, from videos captured by a plurality of terminals, a video to be transmitted to a receiving terminal.
[00309] [00309] This allows the user to freely enjoy a scene by selecting videos corresponding to the video capture terminals, and allows the user to enjoy content obtained by extracting a video at a selected point of view from three-dimensional data reconstructed from a plurality of images or videos. Furthermore, as with video, sound can be recorded from relatively different angles, and the server can multiplex audio from a specific angle or space with the corresponding video, and transmit the multiplexed video and audio.
[00310] [00310] In recent years, content that is a composite of the real world and a virtual world, such as virtual reality (VR) and augmented reality (AR) content, has also become popular. In the case of VR images, the server can create images from the viewpoints of both the left and right eyes, and perform encoding that tolerates reference between the two viewpoint images, such as multi-view coding (MVC), or, alternatively, can encode the images as separate streams without referencing. When the images are decoded as separate streams, the streams can be synchronized when reproduced in order to recreate a virtual three-dimensional space according to the user's point of view.
[00311] [00311] In the case of AR images, the server superimposes virtual object information in a virtual space onto camera information that represents a real-world space, based on a three-dimensional position or the movement of the user's perspective. The decoder can obtain or store the virtual object information and the three-dimensional data, generate two-dimensional images based on the movement of the user's perspective, and then generate superimposed data by seamlessly connecting the images. Alternatively, the decoder can transmit the movement of the user's perspective to the server in addition to a request for virtual object information, and the server can generate superimposed data based on the three-dimensional data stored on the server in accordance with the received movement, and encode and stream the generated superimposed data to the decoder. Note that the superimposed data includes, in addition to RGB values, an α value indicating transparency, and the server sets the α value for sections other than the object generated from the three-dimensional data to, for example, 0, and can perform the encoding while those sections are transparent. Alternatively, the server can set the background to a predetermined RGB value, such as a chroma key, and generate data in which areas other than the object are set as the background.
[00312] [00312] Decoding of similarly streamed data can be performed by the client (that is, the terminals), on the server side, or divided between them. In one example, a terminal can transmit a reception request to a server, the requested content can be received and decoded by another terminal, and a decoded signal can be transmitted to a device that has a display. It is possible to reproduce data of high image quality by decentralizing the processing and appropriately selecting the content, regardless of the processing capacity of the communications terminal itself. In yet another example, while a TV, for example, is receiving image data that is large in size, a region of an image, such as a block obtained by dividing the image, can be decoded and displayed on a personal terminal or terminals of a viewer or viewers of the TV. This makes it possible for the viewers to share a big-picture view as well as for each viewer to check his or her assigned area, or to inspect a region in further detail up close.
[00313] [00313] In situations in which a plurality of wireless connections is possible over near, medium, and long distances, indoors or outdoors, it may be possible to receive content seamlessly using a streaming system standard such as MPEG-DASH. The user can switch between data in real time while freely selecting a decoder or display device, which includes the user's terminal, displays arranged indoors or outdoors, etc. Furthermore, using, for example, information about the user's position, decoding can be performed while switching which terminal handles decoding and which terminal handles the display of content. This makes it possible to map and display information, while the user is moving along a route to a destination, on the wall of a nearby building in which a device capable of displaying content is embedded, or on part of the ground. Furthermore, it is also possible to switch the bit rate of the received data based on the accessibility of the encoded data over the network, such as when the encoded data is cached on a server quickly accessible from the receiving terminal, or when the encoded data is copied to an edge server in a content delivery service.
(Scalable Coding)
[00314] [00314] Content switching will be described with reference to a scalable stream, illustrated in Figure 27, which is encoded by compression implementing the moving image encoding method described in the above modalities. The server can have a configuration in which the content is switched while making use of the temporal and/or spatial scalability of a stream, which is achieved by dividing into and encoding layers, as illustrated in Figure 27. Note that there may be a plurality of individual streams that are of the same content but of different quality. In other words, by determining which layer to decode based on internal factors, such as the processing capacity on the decoder side, and external factors, such as the communication bandwidth, the decoder side can freely switch between low resolution content and high resolution content while decoding. For example, in a case where the user wants to continue watching, for example at home on a device such as a TV connected to the internet, a video that the user had previously been watching on the smartphone ex115 while on the move, the device can simply decode the same stream up to a different layer, which reduces the server-side load.
[00315] [00315] Furthermore, in addition to the configuration described above, in which scalability is achieved as a result of images being encoded per layer, with the enhancement layer being above the base layer, the enhancement layer can include metadata based on, for example, statistical information about the image. The decoder side can generate high-quality image content by performing super-resolution imaging on a base layer image based on the metadata. Super-resolution imaging can improve the SN ratio while maintaining the resolution and/or increasing the resolution. The metadata includes information for identifying a linear or non-linear filter coefficient, as used in super-resolution processing, or information that identifies a parameter value in filter processing, machine learning, or a least squares method used in super-resolution processing.
[00316] [00316] Alternatively, a configuration can be provided in which an image is divided into, for example, blocks according to, for example, the meaning of an object in the image. On the decoder side, only a partial region is decoded by selecting a block to decode. Further, by storing an attribute of the object (person, car, ball, etc.) and a position of the object in the video (coordinates in identical images) as metadata, the decoder side can identify the position of a desired object based on the metadata and determine which block or blocks include that object. For example, as illustrated in Figure 28, the metadata can be stored using a data storage structure other than pixel data, such as an SEI (supplemental enhancement information) message in HEVC. This metadata indicates, for example, the position, size, or color of the main object.
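Such metadata-driven partial decoding reduces, in essence, to a lookup from the object's position to the blocks that cover it, as sketched below; the field layout and the 128-sample block size are illustrative assumptions, not the SEI syntax itself.

```python
# Metadata-driven partial decoding: given an object's position and size from
# metadata (e.g., carried in an SEI message), determine which blocks of the
# picture must be decoded. Block size and field layout are illustrative.

def blocks_covering_object(obj_x, obj_y, obj_w, obj_h, block_size=128):
    first_col, last_col = obj_x // block_size, (obj_x + obj_w - 1) // block_size
    first_row, last_row = obj_y // block_size, (obj_y + obj_h - 1) // block_size
    return [(row, col)
            for row in range(first_row, last_row + 1)
            for col in range(first_col, last_col + 1)]

# A 200x100 object at (300, 50) touches four 128x128 blocks:
print(blocks_covering_object(300, 50, 200, 100))
# [(0, 2), (0, 3), (1, 2), (1, 3)]
```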
[00317] [00317] The metadata can be stored in units of a plurality of images, such as stream, sequence, or random access units. The decoder side can obtain, for example, the time at which a specific person appears in the video and, by fitting that time information with picture-unit information, can identify an image in which the object is present and can determine the position of the object in the image.
(Webpage optimization)
[00318] [00318] Figure 29 illustrates an example of a display screen of a webpage on the computer ex111, for example. Figure 30 illustrates an example of a display screen of a webpage on the smartphone ex115, for example. As illustrated in Figure 29 and Figure 30, a webpage can include a plurality of image links that are links to image content, and the appearance of the webpage differs depending on the device used to view the webpage. When a plurality of image links are visible on the screen, until the user explicitly selects an image link, or until an image link is in the approximate center of the screen or the entire image link fits on the screen, the display device (decoder) can display, as the image links, still images or I pictures included in the image content; can display a video such as an animated gif using a plurality of still images or I pictures; or can receive only the base layer and decode and display the video.
[00319] [00319] When an image link is selected by the user, the display device performs decoding while giving the highest priority to the base layer. Note that if there is information in the HTML code of the webpage indicating that the content is scalable, the display device can decode up to the enhancement layer. Further, in order to guarantee real-time reproduction, before a selection is made or when the bandwidth is severely limited, the display device can reduce the delay between the point in time at which the leading image is decoded and the point in time at which the decoded image is displayed (that is, the delay between the start of decoding the content and the display of the content) by decoding and displaying only forward-reference images (I picture, P picture, forward-reference B picture). Furthermore, the display device can purposely ignore the reference relationship between images, coarsely decode all B images and P images as forward-reference images, and then perform normal decoding as the number of images received over time increases.
(Autonomous Driving)
[00320] [00320] When transmitting and receiving still image or video data, such as two- or three-dimensional map information for autonomous driving or assisted driving of an automobile, the receiving terminal can receive, in addition to image data belonging to one or more layers, information about, for example, the weather or road construction as metadata, and associate the metadata with the image data when decoding. Note that the metadata can be assigned per layer and, alternatively, can simply be multiplexed with the image data.
[00321] [00321] In such a case, since the automobile, drone, airplane, etc., containing the receiving terminal is mobile, the receiving terminal can seamlessly receive data and perform decoding while switching between base stations among the base stations ex106 through ex110 by transmitting information indicating the position of the receiving terminal. Furthermore, according to the selection made by the user, the user's situation, and/or the bandwidth of the connection, the receiving terminal can dynamically select the degree to which the metadata is received or the degree to which the map information, for example, is updated.
[00322] [00322] In the content provisioning system ex100, the client can receive, decode, and reproduce, in real time, encoded information transmitted by the user.
(Individual Content Stream)
[00323] [00323] In the content provisioning system ex100, in addition to high image quality, long content distributed by a video distribution entity, unicast or multicast streaming of low image quality, short content from an individual is also possible. Such content from individuals is likely to further increase in popularity. The server can first perform editing processing on the content before the encoding processing, in order to refine the individual content. This can be achieved using the following configuration, for example.
[00324] [00324] In real time while capturing video or image content, or after the content has been captured and accumulated, the server performs recognition processing based on the raw data or encoded data, such as capture error processing, scene search processing, meaning analysis, and/or object detection processing. Then, based on the result of the recognition processing, the server - either when prompted or automatically - edits the content, examples of which include: correction such as focus and/or motion blur correction; removing low-priority scenes, such as scenes that are low in brightness compared to other images or out of focus; object edge adjustment; and color tone adjustment. The server encodes the edited data based on the result of the editing. It is known that excessively long videos tend to receive fewer views. Consequently, in order to keep the content within a specific length that scales with the length of the original video, the server can, in addition to removing the low-priority scenes described above, automatically cut scenes with little movement based on an image processing result. Alternatively, the server can generate and encode a video summary based on a result of an analysis of the meaning of a scene.
[00325] [00325] There may be cases in which individual content includes content that infringes a copyright, moral right, portrait right, etc. Such a case can lead to an unfavorable situation for the creator, such as when the content is shared beyond the scope intended by the creator. Consequently, before encoding, the server can, for example, edit images so as to blur the faces of people on the periphery of the screen, or blur the interior of a house, for example. In addition, the server can be configured to recognize the faces of people other than a registered person in images to be encoded and, when such faces appear in an image, can apply a mosaic filter, for example, to the face of the person. Alternatively, as pre- or post-processing for encoding, the user can specify, for copyright reasons, a region of an image that includes a person, or a background region to be processed, and the server can process the specified region by, for example, replacing the region with a different image, or blurring the region. If the region includes a person, the person can be tracked in the moving image, and the person's head region can be replaced with another image as the person moves.
[00326] [00326] Since there is a demand for real-time viewing of content produced by individuals, which tends to be small in data size, the decoder first receives the base layer as the highest priority and performs decoding and reproduction, although this may differ depending on the bandwidth. When the content is reproduced two or more times, such as when the decoder receives the enhancement layer during decoding and reproduction of the base layer and loops the reproduction, the decoder can reproduce a high-quality video that includes the enhancement layer. If the stream is encoded using such scalable encoding, the video may be of low quality when unselected or at the start of the video, but the image quality can gradually increase.
[00327] [00327] Encoding and decoding can be performed by an LSI (large scale integration circuit) ex500 (see Figure 26), which is typically included in each terminal. The LSI ex500 can be configured as a single chip or a plurality of chips. Software for encoding and decoding moving images can be integrated into some type of recording medium (such as a CD-ROM, a flexible disk, or a hard disk) that is readable by, for example, the computer ex111, and the encoding and decoding can be performed using the software. Furthermore, when the smartphone ex115 is equipped with a camera, video data obtained by the camera can be transmitted. In this case, the video data is encoded by the LSI ex500 included in the smartphone ex115.
[00328] [00328] Note that the LSI ex500 can be configured to download and activate an application. In such a case, the terminal first determines whether it is compatible with the scheme used to encode the content, or whether it is capable of executing a specific service. When the terminal is not compatible with the encoding scheme of the content, or when the terminal is not capable of executing a specific service, the terminal first downloads a codec or application software and then obtains and reproduces the content.
[00329] [00329] In addition to the example of the content provisioning system ex100 that uses the internet ex101, at least the moving image encoder (image encoder) or the moving image decoder (image decoder) described in the above modalities can be implemented in a digital broadcasting system. The same encoding processing and decoding processing can be applied to transmit and receive broadcast radio waves superimposed with multiplexed audio and video data using, for example, a satellite, although this is geared toward multicast, whereas unicast is easier with the content provisioning system ex100.
(Hardware configuration)
[00330] [00330] Figure 31 illustrates further details of the smartphone ex115 shown in Figure 26. Figure 32 illustrates an example configuration of the smartphone ex115. The smartphone ex115 includes an antenna ex450 for transmitting and receiving radio waves to and from the base station ex110, a camera ex465 capable of capturing video and still images, and a display ex458 that displays decoded data, such as video captured by the camera ex465 and video received by the antenna ex450. The smartphone ex115 further includes a user interface ex466 such as a touch panel, an audio output unit ex457 such as a speaker for outputting voice or other audio, an audio input unit ex456 such as a microphone for audio input, a memory ex467 capable of storing decoded data such as captured video or still images, recorded audio, received video or still images, and mail, as well as decoded data, and a slot ex464 which is an interface for the SIM ex468 that authorizes access to a network and various data. Note that an external memory can be used instead of the memory ex467.
[00331] [00331] The main controller ex460, which comprehensively controls the display ex458 and the user interface ex466, the power supply circuit ex461, the user interface input controller ex462, the video signal processor ex455, the camera interface ex463, the display controller ex459, the modulator/demodulator ex452, the multiplexer/demultiplexer ex453, the audio signal processor ex454, the slot ex464, and the memory ex467 are connected to one another via a bus ex470.
[00332] [00332] When the user turns on the power button of the power supply circuit ex461, the smartphone ex115 is powered on into an operable state, and each component is supplied with battery power.
[00333] [00333] The smartphone ex115 performs processing for, for example, calling and data transmission based on control performed by the main controller ex460, which includes a CPU, ROM, and RAM. When making calls, an audio signal recorded by the audio input unit ex456 is converted into a digital audio signal by the audio signal processor ex454, to which spread spectrum processing is applied by the modulator/demodulator ex452, and digital-analog conversion and frequency conversion processing are applied by the transmitter/receiver ex451, and the resulting signal is transmitted via the antenna ex450. The received data is amplified, frequency converted, and analog-digital converted, inverse spread spectrum processed by the modulator/demodulator ex452, converted into an analog audio signal by the audio signal processor ex454, and then output from the audio output unit ex457. In data transmission mode, text, still image, or video data is transmitted by the main controller ex460 via the user interface input controller ex462 based on operation of the user interface ex466 of the main body, for example. Similar transmission and reception processing is performed. In data transmission mode, when sending video, a still image, or video and audio, the video signal processor ex455 encodes by compression, using the moving image encoding method described in the above modalities, a video signal stored in the memory ex467 or a video signal input from the camera ex465, and transmits the encoded video data to the multiplexer/demultiplexer ex453. The audio signal processor ex454 encodes an audio signal recorded by the audio input unit ex456 while the camera ex465 is capturing video or a still image, and transmits the encoded audio data to the multiplexer/demultiplexer ex453. The multiplexer/demultiplexer ex453 multiplexes the encoded video data and encoded audio data using a predetermined scheme, modulates and converts the data using the modulator/demodulator (modulator/demodulator circuit) ex452 and the transmitter/receiver ex451, and transmits the result via the antenna ex450.
[00334] [00334] When video appended to, for example, an email or a chat, or video linked from a webpage, is received, in order to decode the multiplexed data received via the antenna ex450, the multiplexer/demultiplexer ex453 demultiplexes the multiplexed data to divide it into a bit stream of video data and a bit stream of audio data, supplies the encoded video data to the video signal processor ex455 via the synchronous bus ex470, and supplies the encoded audio data to the audio signal processor ex454 via the synchronous bus ex470. The video signal processor ex455 decodes the video signal using a moving image decoding method corresponding to the moving image encoding method described in the above modalities, and the video or still image included in the linked moving image file is displayed on the display ex458 via the display controller ex459. The audio signal processor ex454 decodes the audio signal and outputs audio from the audio output unit ex457. Since real-time streaming is becoming increasingly popular, there may be instances in which reproduction of the audio may be socially inappropriate, depending on the user's environment. Accordingly, as an initial value, a configuration in which only the video data is reproduced, that is, in which the audio signal is not reproduced, may be preferable; the audio may be synchronized and reproduced only when an input, such as the user clicking the video data, is received.
[00335] [00335] Although the smartphone ex115 was used in the above example, three other implementations are conceivable: a transceiver terminal that includes both an encoder and a decoder; a transmitter terminal that includes only an encoder; and a receiver terminal that includes only a decoder. In the description of the digital broadcasting system, an example is given in which multiplexed data obtained as a result of video data being multiplexed with audio data is received or transmitted. The multiplexed data, however, can be video data multiplexed with data other than audio data, such as text data related to the video. In addition, the video data itself, instead of multiplexed data, can be received or transmitted.
[00336] [00336] Although the main controller ex460, which includes a CPU, is described as controlling the encoding or decoding processes, various terminals often include GPUs. Consequently, a configuration is acceptable in which a large area is processed at once by making use of the performance capacity of the GPU, via memory shared by the CPU and GPU, or memory that includes an address managed so as to allow common usage by the CPU and GPU. This makes it possible to shorten the encoding time, maintain the real-time nature of the stream, and reduce the delay. In particular, processing relating to motion estimation, deblocking filtering, sample adaptive offset (SAO), and transformation/quantization can be effectively performed by the GPU instead of the CPU, in units of images, for example, all at once.
Claims (35)
1. Image encoder characterized by the fact that it comprises: a circuit; and a memory coupled to the circuit; in which the circuit, in operation, performs: dividing an image block into a plurality of partitions that includes a first partition that has a non-rectangular shape and a second partition; predicting a first motion vector for the first partition and a second motion vector for the second partition; and encoding the first partition using the first motion vector and the second partition using the second motion vector.
2. Encoder according to claim 1, characterized by the fact that the second partition has a non-rectangular shape.
3. Encoder according to claim 1, characterized by the fact that the non-rectangular shape is a triangle.
4. Encoder according to claim 1, characterized by the fact that the non-rectangular shape is selected from a group consisting of a triangle, a trapezoid, and a polygon with at least five sides and angles.
5. Encoder according to claim 1, characterized by the fact that the prediction includes selecting the first motion vector from a first set of motion vector candidates and selecting the second motion vector from a second set of motion motion vector candidates.
6. Encoder according to claim 5, characterized by the fact that the first set of motion vector candidates includes motion vectors from partitions neighboring the first partition, and the second set of motion vector candidates includes motion vectors from partitions neighboring the second partition.
7. Encoder according to claim 6, characterized by the fact that the partitions neighboring the first partition and the partitions neighboring the second partition are outside the image block from which the first partition and the second partition are divided.
8. Encoder according to claim 6, characterized by the fact that the neighboring partitions are one or both of spatially neighboring partitions and temporally neighboring partitions.
9. Encoder according to claim 5, characterized by the fact that the first set of motion vector candidates is the same as the second set of motion vector candidates.
10. Encoder according to claim 5, characterized by the fact that at least one of the first set of motion vector candidates and the second set of motion vector candidates is the same as a third set of motion vector candidates for the image block.
11. Encoder according to claim 5, characterized by the fact that the circuit, in response to determining that the second partition has a non-rectangular shape, creates the second set of motion vector candidates so that it includes motion vectors of partitions neighboring the second partition, exclusive of the first partition; and, in response to determining that the second partition has a rectangular shape, creates the second set of motion vector candidates so that it includes motion vectors of partitions neighboring the second partition, inclusive of the first partition.
12. Encoder according to claim 1, characterized by the fact that the prediction includes: selecting a first motion vector candidate from a first set of motion vector candidates and deriving the first motion vector by adding a first motion vector difference to the first motion vector candidate; and selecting a second motion vector candidate from a second set of motion vector candidates and deriving the second motion vector by adding a second motion vector difference to the second motion vector candidate.
13. Image encoder characterized by the fact that it comprises: a divider which, in operation, receives and divides an original image into blocks; an adder which, in operation, receives the divided blocks and predictions from a prediction controller, and subtracts each prediction from its corresponding block to output a residual; a transformer which, in operation, performs a transform on the residuals output from the adder to output transform coefficients; a quantizer which, in operation, quantizes the transform coefficients to generate quantized transform coefficients; an entropy encoder which, in operation, encodes the quantized transform coefficients to generate a bit stream; and the prediction controller coupled to an inter predictor, an intra predictor, and a memory, in which the inter predictor, in operation, generates a prediction of a current block based on a reference block in an encoded reference image, and the intra predictor, in operation, generates a prediction of a current block based on an encoded reference block in a current image, and in which the prediction controller, in operation, divides the blocks into a plurality of partitions that includes a first partition that has a non-rectangular shape and a second partition, predicts a first motion vector for the first partition and a second motion vector for the second partition, and encodes the first partition using the first motion vector and the second partition using the second motion vector.
14. Encoder according to claim 13, characterized by the fact that the second partition has a non-rectangular shape.
15. Encoder according to claim 13, characterized by the fact that the non-rectangular shape is a triangle.
16. Image encoding method characterized by the fact that it comprises: dividing an image block into a plurality of partitions that includes a first partition that has a non-rectangular shape and a second partition; predicting a first motion vector for the first partition and a second motion vector for the second partition; and encoding the first partition using the first motion vector and the second partition using the second motion vector.
17. Method according to claim 16, characterized by the fact that the second partition has a non-rectangular shape.
18. Method according to claim 16, characterized by the fact that the non-rectangular shape is a triangle.
19. Image decoder characterized by the fact that it comprises: a circuit; and a memory coupled to the circuit; in which the circuit, in operation, performs: dividing an image block into a plurality of partitions that includes a first partition that has a non-rectangular shape and a second partition; predicting a first motion vector for the first partition and a second motion vector for the second partition; and decoding the first partition using the first motion vector and the second partition using the second motion vector.
20. Decoder according to claim 19, characterized by the fact that the second partition has a non-rectangular shape.
21. Decoder according to claim 19, characterized by the fact that the non-rectangular shape is a triangle.
22. Decoder according to claim 19, characterized by the fact that the non-rectangular shape is selected from a group consisting of a triangle, a trapezoid, and a polygon with at least five sides and angles.
23. Decoder according to claim 19, characterized by the fact that the prediction includes selecting the first motion vector from a first set of motion vector candidates and selecting the second motion vector from a second set of motion motion vector candidates.
24. Decoder according to claim 23, characterized by the fact that the first set of motion vector candidates includes motion vectors from partitions neighboring the first partition, and the second set of motion vector candidates includes motion vectors from partitions neighboring the second partition.
25. Decoder according to claim 24, characterized by the fact that the partitions neighboring the first partition and the partitions neighboring the second partition are outside the image block from which the first partition and the second partition are divided.
26. Decoder according to claim 24, characterized by the fact that the neighboring partitions are one or both of spatially neighboring partitions and temporally neighboring partitions.
27. Decoder according to claim 23, characterized by the fact that the first set of motion vector candidates is the same as the second set of motion vector candidates.
28. Decoder according to claim 23, characterized by the fact that at least one of the first set of motion vector candidates and the second set of motion vector candidates is the same as a third set of motion vector candidates for the image block.
29. Decoder according to claim 23, characterized by the fact that the circuit, in response to determining that the second partition has a non-rectangular shape, creates the second set of motion vector candidates so that it includes motion vectors of partitions neighboring the second partition, exclusive of the first partition; and, in response to determining that the second partition has a rectangular shape, creates the second set of motion vector candidates so that it includes motion vectors of partitions neighboring the second partition, inclusive of the first partition.
30. Image decoder characterized by the fact that it comprises: an entropy decoder which, in operation, receives and decodes an encoded bit stream to obtain quantized transform coefficients; an inverse quantizer and inverse transformer which, in operation, inverse quantizes the quantized transform coefficients to obtain transform coefficients and inverse transforms the transform coefficients to obtain residuals; an adder which, in operation, adds the residuals output from the inverse quantizer and inverse transformer and predictions output from a prediction controller to reconstruct blocks; and the prediction controller coupled to an inter predictor, an intra predictor, and a memory, in which the inter predictor, in operation, generates a prediction of a current block based on a reference block in a decoded reference image, and the intra predictor, in operation, generates a prediction of a current block based on a decoded reference block in a current image, and in which the prediction controller, in operation, divides an image block into a plurality of partitions that includes a first partition that has a non-rectangular shape and a second partition; predicts a first motion vector for the first partition and a second motion vector for the second partition; and decodes the first partition using the first motion vector and the second partition using the second motion vector.
31. Decoder according to claim 30, characterized by the fact that the second partition has a non-rectangular shape.
32. Decoder according to claim 30, characterized by the fact that the non-rectangular shape is a triangle.
33. Image decoding method characterized by the fact that it comprises: dividing an image block into a plurality of partitions that includes a first partition that has a non-rectangular shape and a second partition; predicting a first motion vector for the first partition and a second motion vector for the second partition; and decoding the first partition using the first motion vector and the second partition using the second motion vector.
34. Method according to claim 33, characterized by the fact that the second partition has a non-rectangular shape.
35. Method according to claim 33, characterized by the fact that the non-rectangular shape is a triangle.
类似技术:
公开号 | 公开日 | 专利标题
BR112020001991A2|2020-08-18|image encoder, image decoder, image encoding method and image decoding method
WO2019039323A1|2019-02-28|Image encoder, image decoder, image encoding method, and image decoding method
WO2019039324A1|2019-02-28|Image encoder, image decoder, image encoding method, and image decoding method
JP6858248B2|2021-04-14|Image decoding device and decoding method
BR112020026686A2|2021-03-30|SYSTEM AND METHOD FOR VIDEO ENCODING
JP2021536191A|2021-12-23|Video coding system and method
BR112020025664A2|2021-03-23|encoder device, decoder device, encoding method and decoding method
BR112020013554A2|2020-12-01|encoder, decoder, encoding method, and decoding method
BR112020010935A2|2020-11-17|image encoding device, image decoding device, image encoding method, and image decoding method
BR112020000219A2|2020-07-07|encoding, encoding method, decoder and decoding method
BR112020019800A2|2021-01-05|ENCODER, DECODER, ENCODING METHOD AND DECODING METHOD
JP6798066B2|2020-12-09|Encoding device, decoding device, coding method, decoding method and picture compression program
JPWO2020162536A1|2021-10-28|Coding device, decoding device, coding method and decoding method
BR112020000876A2|2020-07-21|encoding device, decoding device, encoding method, and decoding method
BR112020016755A2|2020-12-15|ENCODER, DECODER, ENCODING METHOD AND DECODING METHOD
BR112020001579A2|2020-07-21|encoder, decoder, encoding method, decoding method
BR112020021718A2|2021-01-26|encoder, decoder, encoding method and decoding method
BR112021011019A2|2021-08-31|ENCODER, DECODER, ENCODING METHOD AND DECODING METHOD
BR112020022773A2|2021-02-02|encoder, decoder, encoding method and decoding method
BR112021001890A2|2021-04-27|encoder, decoder, encoding method and decoding method
BR112021014711A2|2021-09-28|ENCODER, DECODER, ENCODING METHOD AND DECODING METHOD
Patent family:
Publication number | Publication date
US20220046266A1|2022-02-10|
EP3673656A1|2020-07-01|
EP3673656A4|2020-07-01|
TW201921943A|2019-06-01|
US20200014950A1|2020-01-09|
WO2019039322A1|2019-02-28|
KR20200038943A|2020-04-14|
JP2020532225A|2020-11-05|
US11223844B2|2022-01-11|
CN110999303A|2020-04-10|
Cited documents:
Publication number | Application date | Publication date | Applicant | Patent title

US7756348B2|2006-10-30|2010-07-13|Hewlett-Packard Development Company, L.P.|Method for decomposing a video sequence frame|
JP2012023597A|2010-07-15|2012-02-02|Sony Corp|Image processing device and image processing method|
EP2763415B1|2011-09-29|2020-04-15|Sharp Kabushiki Kaisha|Image decoding apparatus for decoding partition information, image decoding method and image encoding apparatus|
WO2014107074A1|2013-01-04|2014-07-10|삼성전자 주식회사|Motion compensation method and device for encoding and decoding scalable video|
WO2015006884A1|2013-07-19|2015-01-22|Qualcomm Incorporated|3d video coding with partition-based depth inter coding|
EP3058726A1|2013-10-16|2016-08-24|Huawei Technologies Co., Ltd.|A method for determining a corner video part of a partition of a video coding block|
EP3293975A4|2015-09-08|2018-10-10|Samsung Electronics Co., Ltd.|Device and method for entropy encoding and decoding|
CN112956202A|2018-11-06|2021-06-11|北京字节跳动网络技术有限公司|Extension of inter prediction with geometric partitioning|
CN112219400A|2018-11-06|2021-01-12|北京字节跳动网络技术有限公司|Location dependent storage of motion information|
US10893298B2|2018-12-12|2021-01-12|Tencent America LLC|Method and apparatus for video coding|
US20210067776A1|2019-08-30|2021-03-04|Qualcomm Incorporated|Geometric partition mode with harmonized motion field storage and motion compensation|
KR20200058417A|2017-10-16|2020-05-27|디지털인사이트 주식회사|Image encoding / decoding method, apparatus and recording medium storing bitstream|
CN108198145B|2017-12-29|2020-08-28|百度在线网络技术(北京)有限公司|Method and device for point cloud data restoration|
BR112020022773A2|2018-07-04|2021-02-02|Panasonic Intellectual Property Corporation Of America|encoder, decoder, encoding method and decoding method|
US10778977B2|2018-12-05|2020-09-15|Qualcomm Incorporated|Triangle motion information for video coding|
US20200213595A1|2018-12-31|2020-07-02|Comcast Cable Communications, Llc|Methods, Systems, And Apparatuses For Adaptive Processing Of Non-Rectangular Regions Within Coding Units|
KR20210114060A|2019-03-13|2021-09-17|엘지전자 주식회사|DMVR-based inter prediction method and device|
EP3944623A1|2019-03-22|2022-01-26|LG Electronics Inc.|Dmvr-based inter prediction method and apparatus|
KR20200115456A|2019-03-22|2020-10-07|엘지전자 주식회사|DMVR and BDOF-based inter prediction method and apparatus|
US11140409B2|2019-03-22|2021-10-05|Lg Electronics Inc.|DMVR and BDOF based inter prediction method and apparatus thereof|
CN113055682A|2019-06-24|2021-06-29|杭州海康威视数字技术股份有限公司|Encoding and decoding method, device and equipment|
WO2020260766A1|2019-06-26|2020-12-30|Nokia Technologies Oy|Method, apparatus, computer program product for storing motion vectors for video encoding|
CN114080807A|2019-07-02|2022-02-22|北京达佳互联信息技术有限公司|Method and device for video coding and decoding by utilizing triangular partition|
WO2021040037A1|2019-08-29|2021-03-04|日本放送協会|Encoding apparatus, decoding apparatus, and program|
KR20210034534A|2019-09-20|2021-03-30|한국전자통신연구원|Method and apparatus for encoding/decoding image and recording medium for storing bitstream|
KR20210035069A|2019-09-23|2021-03-31|주식회사 케이티|Method and apparatus for processing a video|
WO2021134393A1|2019-12-31|2021-07-08|Huawei Technologies Co., Ltd.|Method and apparatus of deblocking filtering between boundaries of blocks predicted using weighted prediction and non-rectangular merge modes|
WO2021219416A1|2020-04-30|2021-11-04|Huawei Technologies Co., Ltd.|Triangulation-based adaptive subsampling of dense motion vector fields|
Legal status:
2021-11-03|B350|Update of information on the portal [chapter 15.35 patent gazette]|
Priority:
Application number | Application date | Patent title
US201762548631P| true| 2017-08-22|2017-08-22|
US62/548,631|2017-08-22|
US201862698785P| true| 2018-07-16|2018-07-16|
US62/698,785|2018-07-16|
PCT/JP2018/030059|WO2019039322A1|2017-08-22|2018-08-10|Image encoder, image decoder, image encoding method, and image decoding method|